There are signs on the intellectual scene that we are moving out of an era in the social sciences characterized by
modernism, a belief that separating fact from value, truth from falsity, is just a matter of applying the right
version of method. The purpose of this paper is to introduce accounting researchers to a movement termed
“deconstruction” which reflects the postmodern view that modernism is an untenable philosophical
position. Postmodern thought in general and deconstruction in particular demand self-reflection and
abandon any desire to somehow “ground” knowledge in an external and transcendental metaphysic like the
positivist’s faith in observation or the Marxist’s faith in historical determinism. Deconstruction differs from
the academic tradition in which competing metaphysics attack each other with their different dogmas.
Instead, it works from within a research paper (text), taking an author’s own criteria for privileging his or
her work, and then de-constructs the text by pointing out how the author violates his or her own system
of privilege. In this study, we both introduce deconstruction and apply it to Michael Jensen’s “Organization
Theory and Methodology” [The Accounting Review (April 1983) pp. 319-339], a text which would suggest
that positive theory in accounting should be privileged over other ways of knowing and writing accounting
discourses. We show through deconstruction that positive theory and the empirical tradition are not
entitled to the kind of epistemic privilege and authority that they have enjoyed in silencing other kinds of
writings about accounting. Deconstruction, then, is a moment of resistance to the reductionism of
modernism and its desire for knowledge closure. It resists [...] and restores life to its
original difficulty before our obeisance to [...].


Firstly ... nothing exists; secondly ... even if anything
exists, it is inapprehensible by man; thirdly ... even if
anything is apprehensible, yet of a surety it is inexpressible
and incommunicable to one’s neighbor (Gorgias, 5th
century B.C.).


An emerging body of critical accounting literature
subverts the mainstream view that knowledge of
accounting is grounded in objectivist and
foundationalist principles. Critical accounting draws
upon the arguments of contemporary intellectual thought
and shares with scholars in many other fields a belief
in the indeterminacy of knowledge claims. Indeterminacy
rejects the notion that knowledge is externally grounded
(legitimated outside its own discourse) and is revealed
through systems of rules that are privileged over other
ways of understanding and other approaches to the
production of knowledge.

In accounting, the prescriptions of positive theorists
and empiricists are objectivist and foundational and
reflect a kind of naive realism. As a privileged theory
of science, naive realism is intellectually untenable, a
point well established in the philosophy of science
(Adorno, 1983; Bernstein, 1983, 1986; Gadamer, 1977,
1984; Goodman, 1978, 1984; Habermas, 1971, 1975; Harre,
1986; Kuhn, 1970; Laudan, 1977; Polanyi, 1964, 1966;
Popper, 1972; Putnam, 1978; Rorty, 1979, 1982; Quine,
1980) and reiterated in the accounting literature
(Arrington, 1987; Chua, 1986; Christenson, 1983; Devine,
1985; Tinker et al., 1982). Further, as a theory about
social institutions, positive theory mistakes an abstract
analytical apparatus (its theory of “the market”) for a
particular empirical vision (see Unger, 1986, pp. 11-14).
Thus, positive theory orchestrates the “empirical”
through its own vision of the moral order of society (the
market), trapping itself in the only option available to
humans: an awareness that the conduct of scientific
inquiry always proceeds with social “facts” and social
“values” codetermined. Baynes et al. (1987, p. 5) state:


The object of knowledge is always already preinterpreted,
situated in a scheme, part of a text, outside which there
are only other texts. On the other hand, the subject of
knowledge belongs to the very world it wishes to
interpret.... Thus the idea of the knowing subject
disengaged from the body and from the world makes no more
sense than the idea of self-transparence; there is no
knowledge without a background, and that background can
never be wholly objectified.


We appreciate the comments of the anonymous reviewers, participants at the Project on the Rhetoric of Inquiry [...]
at the University of Iowa, accounting workshops at the University of Iowa, University of Maryland, University of Missouri,
Pennsylvania State University, University of Wisconsin-Madison, and conference participants at the 1987 annual meetings of
the American Accounting Association, Allied Social Science Association and the European Accounting Association. The
project was also aided by our time at University House, The University of Iowa, and we thank Jay Semet and our University
House [...] for their support and [...].


But positive theory approaches its data as providing
value-free tests of its theories. The notion that
empirical testing provides external and independent
evidence on the integrity of a theory is philosophically
bankrupt and ignores our capacity to construct realities.
Felperin (1985, pp. 27-28) asks, “Since when has a
paradigm, from that of Ptolemy to that of Einstein,
failed to discover or generate the particulars to fill
itself out?” Thus positive theory can be subverted
through critical attention to presuppositions about both
its status as science and its claim to produce empirical
evidence about “the way the world is” independently of
its own values about “the way the world should be.”
Despite its intellectual naivete, positive theory is a
powerful influence on accounting research and graduate
education in the U.S. today.

This paper introduces a philosophical praxis termed
deconstruction (Derrida, 1976, 1978, 1979, 1982, 1983,
1986, 1987) as one moment of the turn toward critical
accounting research and employs deconstructive strategies
to de-center the prescriptions of positive theory as set
out in Jensen (1983). Deconstruction challenges an
author’s attempt to privilege a theory, technique or
model as a superior way to arrive at closure around
knowledge. Along with the critical historical practice of
Michel Foucault, a practice beginning to surface in the
accounting literature, deconstruction challenges efforts
to reduce the complexity of human existence to “systems”
of explication, systems orchestrated by the author’s
theories, values, and presuppositions. For example, the
work of Hopwood (1987), Miller & O’Leary (1987), Loft
(1986) and Hoskin & Macve (1986, 1987) employs Foucault’s
praxis as a way to rewrite the history of accounting, to
subvert the belief that accounting develops functionally
as a passive tool of economic efficiency. Instead, the
history of accounting is interpreted as a complex web of
economic, political and accidental co-occurrences that
mirror neither technical rationality nor a necessary
progress.

Deconstruction shares with Foucauldian scholarship a
desire to question Enlightenment models of rationality,
theory, facticity and history as progress.¹ This critical
stance is perhaps best described by the intellectual
movement termed postmodernism (see Lyotard, 1984), an
attempt to arrive at a “new beginning” for understanding
the nature of human inquiry.² Postmodernism takes
seriously the intellectual position that any metascheme
for the production of knowledge can neither ground itself
in anything other than an act of faith nor can it
establish its capacity to insure that its own program can
solve more problems than it creates. That knowledge in
the human sciences is problematic arises from an
awareness that the powers of Enlightenment rationality,
which were originally turned to mastery of nature, have
now been turned to the mastery of human beings by other
human beings. In Foucault’s (1977a) terms, the production
of knowledge is never separable from the exercise of
power. For example, the concept of economic rationality
is almost never questioned as a motif of progress (at
least in the U.S. accounting literature). Yet one can
point to examples which suggest that the unidimensional
application of economic rationality in decision making is
a major force behind human disease, death, famine and
even “natural disasters” (see Lamb, 1982, p. 4).


1 Nietzsche viewed rationalist systems like science as “merely a prolongation of theology” and a continuation of “the longest
lie” ... that there is something out there beyond ourselves to establish first principles of right thinking. In Rorty’s (1982, p.
208) view, the work of postmodern thinkers like Derrida can serve as a kind of freedom, “an attempt to free Mankind from
Nietzsche’s ‘longest lie’ [what Derrida terms the “onto-theological” tradition], the notion that outside the haphazard and
perilous experiments we perform there lies something (God, Science, Knowledge, Rationality, Truth) which will, only if we
perform the correct rituals, step in to save us.”

2 Postmodernism emerges in the wake of modernity’s crisis which, for Lyotard, is the loss of faith or meaning in the grand
narrative that legitimates (among other things) science. Lyotard (1984, p. 28) notes “The state spends large amounts of
money to enable science to pass itself off as an epic: the State’s own credibility is based on that epic, which it uses to obtain
the public consent its decision makers need.” Lyotard goes on to say (p. 41) “That is what the postmodern world is all about.
Most people have lost the nostalgia for the lost narrative.” In the specific problem concerning the legitimation of science,
Lyotard states (p. 29):

With modern science, two new features appear in the problematic of legitimation. To begin with, it leaves behind the
metaphysical search for a first proof of transcendental authority as a response to the question: “How do you prove the proof?”
or, more generally, “Who decides the condition of truth?” It is recognized that the conditions of truth, in other words the rules
of the game of science, are immanent in that game, that they can only be established within the bonds of a debate that is already
scientific in nature, and that there is no other proof that the rules are good than the consensus extended to them by the
experts.

As an example of the loss of nostalgia or faith in the grand narrative, consider the recent sharp decline on the New York
Stock Exchange. Over a 2 day trading period (16 October 1987 and 19 October 1987) the Dow Jones Industrial Average
dropped 26.2%, but over the next 2 days recovered nearly half the loss in value. Why did such violent price swings occur
over such a short time period? No one really knows. The Wall Street Journal (23 October 1987, p. 8) reported that efficient
market theorists were “totally perplexed,” and The New York Times (25 October 1987, section 4, p. 1) said “In an era of
specialized expertise and of computer-driven explosion in information, the experts and the computers neither predicted the
sickening collapse or explained it with any conviction or consistency.” The New York Times (23 October 1987, p. 1)
suggested nine “discouraging signs” (fighting in the Persian Gulf, bearish advice, foreign market volatility, fear of panic, trade
and budget deficits, German monetary policy, flight to gold, flight to bonds, doubts about Presidential leadership) and six
“encouraging signs” (cut in prime interest rate, halt in computerized program trading, intervention of the Federal Reserve
Bank, a healthy economy, a stable dollar, foreign capital growth). Fifteen stories, some in conflict with one another, that
provide little understanding of why such a dramatic event occurred, and which taken together reveal that the grand narrative
of market efficiency is in shambles. The Wall Street Journal (23 October 1987, p. 13) quotes Robert Shiller, a Yale economist,
as saying “The efficient-market hypothesis is the most remarkable error in the history of economic theory. This is just another
nail in its coffin.”


Thus postmodernism generally and deconstruction
particularly holds nothing sacred, seeks to question
every assumption and presupposition offered under the
rubric of “knowledge,” and, most importantly, challenges
any author[ity]’s claim to a “method” of knowledge
production that is privileged over others. But it is not
a mere relativism (to be a relativist requires a
grounding in something to be relative to); the texts of
deconstruction engage rigorous philosophical and
intellectual arguments to challenge objectivism and
foundationalism. Deconstruction bores within texts and
demonstrates how the legitimacy of any claim to an
external metaphysical grounding (beyond the text itself)
is supported by only the linguistic and rhetorical,
perhaps sophist[icated], strategies of the author. The
purpose of deconstruction, then, is to subvert the
attempt to get closure around knowledge production, the
attempt to silence other voices by illicitly claiming to
possess a superior awareness of “truth.” The subversive
impulse makes deconstruction polemical, a political act
designed to critique and dismantle intellectual elitism.
Felperin says:


Its [deconstruction’s] polemic is directed not against
one school or another, but against the purist or
imperialist tendency of them all, their motivating belief
that persistence in theory (their own in particular) will
resolve the problems that have beset and debilitated past
practice rather than throw up new ones just as
debilitating (1985, p. 1).


Our purpose in introducing deconstruction to accounting
is two-fold. First, we believe that accounting has the
capacity to construct realities
in a manner that dictates the conditions of
human life and that current theories of accounting are
infused with unexamined commitments to particular moral
and social orders. Thus, the practice of accounting and
theorizing about that practice are always and already
informed by ethics which help to create the material
conditions of human lives. To deny the value-ladenness of
one’s theorizing is to deny responsibility for the
consequences of one’s theories. Deconstruction can serve
as a practice oriented toward forcing those value
commitments to the surface of our “scientific” practice
and lead us to question our research on ethical grounds.
Our second purpose is to subvert the pretensions of
positive theory as a theory of knowledge production. We
argue that this school of accounting research exercises
undue influence on the production of accounting
knowledge. This influence is due to many factors; among
them: (1) accounting scholars’ unwillingness to
critically examine the political, ontological,
metaphysical and epistemological assumptions that
underlie research, and (2) specific institutional
arrangements for the production and dissemination of
accounting knowledge which form a “market” for accounting
research that is driven by factors beyond the
intellectual competence of the research. Thus, accounting
research is less expansive and less intellectually
rigorous than it could be because of the disciplining
forces of a hegemonic academic elite. The theories
proposed by this elite also reflect an extremely
conservative political perspective on the role of
accounting in producing the social order. Our dual
purposes, then, are designed to hold positive theory
intellectually accountable and to make clear the fact
that knowledge production is always a political act. We
find deconstruction a useful praxis for these purposes,
just as Foucauldian historical exegeses are useful in
subverting the notion of accounting as a technical
rational tool at the disposal of progress.³





PLACING DECONSTRUCTION INTO 
[CON]-TEXTS 


Knowledge as human discourse 

Every claim to knowledge is a discourse, a text, and is
both a product of human manufacture and inseparable from
the language which gives it expression. The first point
means that “facts,” “evidence” and “theory” are never
approached independently of human values which are,
always, logically prior to them. For example, the
positive theorist’s claim to separate knowledge of “the
way the world is” from opinion about “the way it should
be” ignores the fact that a complex web of values,
political beliefs, ideologies, and opinions inform the
construction of any theory and the design of any
experiment. The second point means that knowledge
production is always a product of telling a story, and
the “factual” content of that story is never separable
from the duplicities of language and the rhetorical
strategies which support it: literality and literarity
co-occur. Every author attempts to persuade (or perhaps
seduce) readers into accepting his or her text as
believable. The positive theorist’s claim to facticity
and to a superior way of knowing are rhetorical
strategies, successful because of their persuasive
properties rather than their intellectual credibility.
The deconstructive project is designed to subvert any
attempt to deny the codetermination of fact and human
value and disclaims an ability to write within a
foundational language (like empiricism) assumed to
represent “reality” best. What deconstruction attempts to
destroy is the claim of any rule-bound system of
knowledge production to unequivocally dominate another,
the attempt to close off the scholarly conversation by
privileging one’s own discourse over others.

These two themes, the denial of the codetermination of
facts and human values and the denial of the rhetorical
character of knowledge, are both contained within
positive accounting theory. Very simply, positive theory
assumes that the application of a kind of naive
empiricism is the key to unlock meanings assumed to
reside in “nature” independently of the influence of the
researcher. “The way the world is” is assumed to be both
objectively knowable and to precede and produce the
language in which it is described. By this account,
language is not problematic since it is a mimetic
representation of “reality.” Positive theory thus
requires a metaphysical and romantic belief which grounds
knowledge in the assumption of an observational presence
and a correct observational language that mimetically
reports upon that presence. It is a naive view that has
been seriously compromised by twentieth century
developments in philosophy and language, philosophy of
science, mathematics (Gödel’s theorem) and even physics,
where developments like Heisenberg’s uncertainty
principle and quantum mechanics make it impossible to
believe in an absolute, realist discourse about “the way
the world is.” Suffice it to say that, as umpires in the
game of knowledge production, we can no longer credibly
argue that “we call them the way they are.”


3 While in this paper we provide an introduction to some of the key concepts of deconstruction and demonstrate their
application, we only scratch the surface of deconstruction. We trust that the implications of the deconstructive program will
be sufficiently clear to lead interested readers to further study. Before engaging deconstruction, we will provide some
philosophical background for its context.


The metaphysics of presence

The attempt to externally ground or originate knowledge
prior to its production through language and human
purpose is what Derrida (1976) terms metaphysics of
presence, a “logocentric, ultimately religious,
superstitious, or nostalgic, impulse to ground or centre
discourse in an originary author, response in a unitary
subject, and textuality on a re-presentable world, when
all are nothing other than effects and functions of
linguistic differences” (Felperin, 1985, p. 35). For
purposes of deconstructing positive theory, the grounding
of “textuality on a re-presentable world” (empirical
testing of the “data” as grounding the “truth” or
“falsity” of a theoretical discourse) is the relevant
metaphysic of presence. Our deconstruction of this
metaphysic of presence will move through revealing how
the prescriptions of positive theory function
linguistically rather than foundationally and cannot
purge themselves of the rhetorical and ideological
commitments which make them, as with any discourse,
illicit in claiming status as either a privileged way of
producing knowledge or as grounding a discourse about
“the way the world is.”

The notion of linguistic differencing that is central to
deconstruction originates from the pre-eminent concern
with language in twentieth century thought. With a few
exceptions, accounting researchers have paid little
attention to the role of language in the production of
knowledge, and, as a result, it is necessary to provide a
brief background before moving on to deconstruction. The
background which follows situates ideas about language in
the context of positive accounting theory and
deconstruction.


Situating knowledge as conditional upon 
language 

To return to baseball, our second umpire is a bit more
scientifically sophisticated in that he “calls them the
way he sees them” rather than assuming the capacity to
call them the way they are. To draw the philosophical
analogy, the first umpire is situated as a positivist
(see Abbagnano, 1967) while the second might represent,
among many others, a logical positivist (see Janik and
Toulmin, 1973). Just as the second umpire recognizes the
error of assuming that his observations map
isomorphically to a “truth” apart from himself (nature’s
own balls and strikes), logical positivists recognized
that truth claims are always conditional upon a
prespecified language. Troubled by the fact that the
positivists’ privilege of observation left mathematics
and logic in a secondary role, the logical positivists
separated synthetic truth statements, verifiable by the
language of observation, from analytic truth statements,
verifiable by pure reason. Since then, it has been
impossible to return to the view that knowledge is
independent of language, and the implications of language
have since extended far beyond the synthetic/analytical
languages of logical positivism. In Eagleton’s (1983,
p. 97) terms, “Language, with its problems, mysteries and
implications, has become both paradigm and obsession for
twentieth-century intellectual life.”


4 This is from a well-known story about three baseball umpires and how they call balls and strikes. The first calls them as they
are; the second calls them as he sees them; the third replies that they “ain’t nothing ’til I call them.”

A second important historical moment is the predominantly
French intellectual movement, structuralism, which grew
out of Saussure’s (1916) pioneering work in linguistics.
Saussure’s work dealt with spoken language and the
distinction between sound (signifier or sign) and concept
(signified). Saussure rejected the mimetic grounding of
sound signs in concepts to which they referred; rather,
he argued that signs operate by differencing themselves
from other signs. For Saussure, the sign c-a-r produces
meaning only by differing from c-a-p or b-a-r.
Structuralism more generally is concerned with the manner
in which meaning in any sign system depends on the system
of differencing among signs in that system. Thus in
written language, the sign (word) is only a linguistic
reference, not a referent to a phenomenal object with a
grounded meaning which the sign serves. With
structuralism, to signify through language becomes a
strictly linguistic differencing that “produces” meaning
within texts only through the process of differencing.

Structuralism can be illustrated with an accounting
example. In positive theory, preferences for alternative
accounting methods are argued to be associated with a
firm’s political visibility. Specifically, large firms
have an incentive to select accounting methods that
produce lower numbers to protect themselves from
political attacks on the magnitude of their profits.
Hypotheses are then tested through using firm size (total
assets, etc.) as a “sign” for the concept of political
visibility. Yet in other empirical research, firm size is
a sign for other concepts (to classify organization
structures, to measure security price returns as a
function of firm size, etc.). The empirical test of firm
size does not provide transcendental evidence on
political visibility. Rather, the meaning of political
visibility derives from the theorist’s choosing to
signify political visibility rather than to signify, say,
organizational structure or other concepts related to
size. The “data” in all studies that use firm size are
the same; the differences in the studies are choices
about what signifiers to rhetorically employ. The
“presence” of political visibility is only signified by
the “absence” of alternative signs: not organizational
structure, not abnormal security returns, etc. The
theorist has not provided evidence about “the way the
world is” so much as he has told a story that may or may
not be persuasive. We can now replace our second umpire
with a third, a structuralist, who says “they ain’t
nothing ’til I call them.”


Deconstruction as exegesis and subversion 

While structuralism moves us to knowledge as a process of
differencing among signs, deconstruction goes further. A
deconstructive umpire would perhaps not question the
importance of the structuralist signature (“they ain’t
nothing ’til I call them”), but would challenge the
privilege attached to a “strike zone” rather than a “ball
zone” or the notion of any “zone” and the reasons why the
practice of “umpiring” is the field (ground) of
intellectual activity with only male umpires. For
Derrida, structuralism is itself an attempt to occupy the
center rather than extend its own awareness of the “free
play” of signification to its limit. In Derrida’s terms,
to play out the limits of structuralism while giving up
the desire for the center (the “zone”) is to recognize
the infinite possibilities of language and knowledge:


... there was no center, that the center could not be
thought in the form of present-being, that the center had
no natural site, that it was not a fixed locus but a
function, a sort of nonlocus in which an infinite number
of sign substitutions came into play. This was the moment
when language invaded the universal problematic, the
moment when, in the absence of a center or origin,
everything became discourse ... that is to say, a system
in which the central signified, the original or
transcendental signified, is never absolutely present
outside a system of differences. The absence of the
transcendental signified extends the domain and the play
of signification infinitely (Derrida, 1978, p. 280).

... the entire history of the concept of structure ...
must be thought of as a series of substitutions of center
for center, as a linked chain of determinations of the
center. Successively, and in a regulated fashion, the
center receives different forms or names. The history of
metaphysics like the history of the West is the history
of these metaphors and metonymies (Derrida, 1978,
p. 279).


Derrida would deny us the comfort that attaches to
beliefs that there is something outside of ourselves that
shares responsibility for our discourse, whether that
something is logic, empiricism, God, or, close to our
hearts, the invisible hand that guides the “market” as an
acceptable vision of the moral order. This knowable
“something” that originates in “nature” outside ourselves
is depicted by Derrida’s term logocentrism, a belief in
origins, in the order and rationality of things, and in
the knowability of “truth.” Bernstein (1987, p. 5) says,
“It is as if Derrida is telling us that the deepest
desire in the Western philosophical tradition has been to
locate some fixed permanent center, some Archimedean
point, some ground ...” Derrida’s work on the exegesis of
texts informs us of the ways that meanings are
constructed within texts and to de-construct is, above
all else, a subversion of the hegemonic authority of a
text to speak of “truth” as originating outside the text.
Deconstructive readings of texts reveal how unruly and
unstable meaning is and efface the veil of linguistic law
and order we place over texts. By contrast, modernist
accounting discourses deny their discursive textuality,
deny their constructivist origins, and present themselves
as originating outside themselves mimetically
re-presenting “nature.”

Attempts to occupy the center, to privilege one’s
discourses, make knowledge production always and already
a violent and political act designed to “arrest” the
other. This battle for the center is a history of one
system using its own arguments to attack the arguments of
another system. For example, Marxist accountants attack
positivists for their reductionism, mystification, and
reification. Positivists attack everybody who does not
employ positivist “methods” for a failure to provide
“correct” empirical evidence. In short, these systems
drop bombs on each other; they explode the enemy.
Deconstruction takes an author’s own system of grounding
and reveals how his or her text violates that system: it
bores from within; it implodes. Norris (1982, pp. 18-19)
says:


Derrida refuses to grant philosophy the kind of
privileged status it has always claimed as the sovereign
dispenser of reason. Derrida confronts this claim to
power on its own chosen ground. He argues that
philosophers have been able to impose their various
systems of thought only by ignoring, or suppressing, the
disruptive effects of language. His aim is always to draw
out these effects by a critical reading which fastens on,
and skillfully unpicks, the elements of metaphor and
other figurative devices at work in the texts of
philosophy (emphasis added).


This paper will apply basic concepts of decon- 
struction to a text which we believe is reflective 
of the most powerful claim to “the center” in U.S. 
accounting discourse, Michael Jensen’s (1983) 
“Organization Theory and Methodology.” This is 
a text that prescribes research discourse and 
privileges a specific libertarian microeconomic 
empirical vision as representative of “the way 
the world is.” The deconstruction will proceed 
through a close reading of this text and de- 
monstrate that the positive theorists’ claim to 
privilege is both illicit and dangerously capable 
of closing off the conversational space of ac- 
counting research. 

Before proceeding, some caveats are in order 
about deconstruction itself. The discourse of de- 
construction is about the center, requires the 
center, but cannot claim the center. It would 


5 Logocentrism is a neologism from the Greek word logos. The importance of logos to Western metaphysics (and to
language) can be seen in the opening verse of St John. In the English translation the word “Word” is substituted for logos in
the original Greek text:

In the beginning was the Word, and
the Word was with God, and the Word
was God. The same was in the beginning
with God.

By the Biblical account, language is transcendental and reveals the word of God (see Handelman, 1987 for a discussion of
deconstruction in the context of Hebraic and Hellenic thought).

perhaps like another system but recognizes as
Foucault does that “... to imagine another system
is to extend our participation in the present
system” (Foucault, 1977b, p. 230). Deconstruc- 
tion is inescapably bound within the logocentric 
discourse of the Western intellectual tradition. 
There is a Derridean double-bind, then, in that a 
discourse of de-centering requires always and al- 
ready the presence of the center, and in this 
sense deconstruction is about yet cannot escape 
from the center and the metaphysics of presence.
Parasitically, a host is needed. Derrida
(1978, pp. 280-281) says: 


There is no sense in doing without the concepts of 
metaphysics in order to shake metaphysics. We have no 
language — no syntax and no lexicon — which is foreign 
to this history; we can pronounce not a single destructive 
proposition which has not already had to slip into the 
form, the logic, and the implicit postulations of precisely 
what it seeks to contest.


One way that Derrida resists the logocentric 
pressure on his own deconstructive discourses 
is through pushing the play of language into 
idiosyncrasies, apparent but canny obscurantism,
elliptical styles and experiments with
phantom voices. He writes words under erasure 
(using them only to cross them out to gesture toward
their necessity and their impotence); rupturing
the language with neologisms such as differance,
trace and supplement; juxtaposing two
seemingly unrelated texts through split writing
as in Glas; bracketing signs so as to suspend sig- 
nification, and punning and parody that self-ef- 
faces the seriousness of his texts by refusing to 
be serious about them in spite of their serious- 
ness. 

Any deconstructive text, then, can be sub- 
jected to deconstruction, largely because of its 
admission to the impossibility of any writing to
transcendentally ground itself in the center. Deconstruction
is a moment of negation which
privileges the “play” of meaning and dissemination
over the closing off and inscription of meaning
— proliferation around knowledge rather
than closure around truth. Knowledge is interpretive,
hermeneutical rather than centered
on global and logo-centered truths. Rather than 
what is said, then, an open deconstructive 
discourse is more concerned with what is not 
said. It is this absence of absoluteness (even a 
stochastic absoluteness), absence of theory, 
absence of positive programs and absence of 
method[ologies] that sets off postmodernism 
and deconstruction from the author[ity] of mod- 
ernism and logocentrism. Thus deconstruction 
does not want to be any-THING, does not want 
to be a Method. A Method (capital M) is a proper 
noun, a prescriptive program that instructs 
people in what they should do. Positive theory 
and empiricist privilege in accounting research 
are these kinds of normative structures. Decon- 
struction is a method (no capital) in the sense of 
a process of revealing the inadequacy of any 
Method to serve this role of a dispenser of rules 
for the production of knowledge. [For a
critique of the modernist privilege of Method in
economics research, see McCloskey (1983,
1984, 1985).]

But deconstruction is intimately involved
with language and preserves a role for rhetoric.
It would likely accept that most authors want to 
persuade others and would share Norris’ (1982, 
p. 61) claim that “Behind all the big guns of rea- 
son and morality is a fundamental will to per- 
suade which craftily disguises its workings by 
imputing them always to the adversary camp. 
Truth is simply the honorific title assumed by an 
argument which has got the upper hand — and 
kept it — in this war of competing persuasions.” 
Norris argues that it is a Nietzschean view which 
recovers rhetoric from philosophy since, “Only 
by suppressing its origins in metaphor had phil- 
osophy, from Plato to the present, maintained 
the sway of a tyrannizing reason which in effect 
denied any dealing with figural language” (p. 
61). Thus rhetoric, in both classical senses of ar- 
gument and tropes, is always and already at the 


6 Derrida is a hermeneuticist in the general sense of being concerned with critical interpretation and textual exegesis. But in
contrast to Gadamer (1977, 1984), Derrida is a radical hermeneuticist more concerned with opening up the play and
dissemination of meaning than with the question of self-understanding that characterizes Gadamer’s project (see Caputo,
1987).

core of knowledge production, and the restoration
of rhetoric is cause for celebration, creating
a wedge for creativity, imagination, and “natural”
skill. Norris continues: “Reason had crushed out
the imaginative life of philosophy ... as the pale 
destroyer of all that gives life, variety and zest 
to the enterprise of human understanding. To 
restore that buried tradition is to show how
‘reason’ usurped its place by systematically 
opposing and disguising the rhetorical gambits 
of language” (pp. 57—58).8 

One might ask of what possible benefit is a sys- 
tem that aims only at epistemological negation, a 
system that provides no solutions. To answer 
this, we return to the polemic of deconstruction 
— it is directed at overcoming the imperialist 
tendency of all Theory, resisting the author[ity] 
of logocentrism, structure and totality — the 
permanent fixed center — that can beget intellectual
totalitarianism and tyranny (Lyotard,
1984). Caputo (1987, p. 7) responds to the criti- 
cism that deconstruction is nihilist and anarchis- 
tic by inverting the “positive” in modernism 
with the “negative” in deconstruction and radi- 
cal hermeneutics: 


The point is to make life difficult, not impossible — to 
face up to the difference and difficulty which enter into 
what we think and do and hope for, not to grind them to 
a halt. Indeed, it is the claim of radical hermeneutics that
we get the best results by yielding to the difficulty in “rea- 
son,” “ethics,” and “faith,” not by trying to cover it up. 
Once we stop trying to prop up our beliefs, practices, and 
institutions on the metaphysics of presence, once we 
give up the idea that they are endowed with some sort of 
facile transparency, we find that they are not washed 
away but liberated, albeit in a way which makes the guar- 
dians of Being and presence nervous. Far from abandon- 
ing us to the wolves, radical hermeneutics issues in far 
more reasonable and indeed less dangerous ideas of reason,
ethics, and faith than those that metaphysics has
been peddling for some time now. Curiously enough, the 
metaphysical desire to make things safe and secure has 
become consummately dangerous.


The positive moment of deconstruction, then, 
lies in restoring life to its original difficulty 
which modernism and metaphysics have veiled 
and caused us to forget. As Ryan (1984, p. 8) says 
“To affirm the abyss deconstruction opens in the 
domain of knowledge is politically to affirm the
permanent possibility of social change.”


ORGANIZATIONAL THEORY, 
DECONSTRUCTION, AND ACCOUNTING 
RESEARCH: A CASE STUDY OF 
MICHAEL JENSEN'S “ORGANIZATION 
THEORY AND METHODOLOGY” 


We have chosen organizational theory and its 
relationship to accounting as the province of our
illustration of deconstruction, and use Michael 
Jensen’s (1983) “Organization Theory and 
Methodology” as the text for our deconstructive 
exegesis. Organizational theory is particularly 
appealing because of the many diverse views 
present. The views range from organizations as 
neatly ordered units where strategic planning 
and order lead to nirvanic profits, to “organized 
anarchies,” which mysteriously function effec- 
tively. The role of accounting under the various 
views is totally different. In the orderly case, the 
accounting system enhances success through 
planning and control. In the anarchic case, ac- 
counting systems (as we conceptualize them in 
our textbooks) may be dysfunctional (see 
Cooper et al., 1981). However, Jensen’s text
holds out the possibility that all of the complex- 


7 Deictics refers to discourse as without objective referents since objective referents are conditional upon language. Godzich
(1986, pp. xv-xviii) says, “Deixis is the linguistic mechanism that permits the articulation of all these distinctions between
the here and the there, the now and the then, the we and the you. It establishes the existence of an ‘out there’ that is not an
‘over here,’ and thus it is fundamental to the theoretical enterprise. It gives it its authority.” Most importantly, it means that
one does not have to choose sides with either the realists or the nominalists; it doesn’t matter which one is right since we
can speak as if they are both right.


8 This same kind of “freedom” emerges in the revival of the hermeneutic tradition in accounting (see Boland, 1985). It
recovers thought from the positivist reduction to information and recovers understanding from its reduction to rational
decision making.


10 C. EDWARD ARRINGTON and JERE R. FRANCIS 


ity surrounding organizations can be robustly 
represented with the microeconomic arguments
that he advances.

This section of the paper will utilize Derrida’s 
concepts of aporia, differance, supplementarity 
and trace to show how Jensen’s text violates it- 
self by deviating from its own premises. As with 
any text which grounds itself in some external 
criteria, which privileges itself as a dispenser of 
rules, deconstruction will take the author liter- 
ally and reveal how the text dismantles itself in 
the light of violations of its own rule system. In 
this case, Jensen’s positive theory vanishes 
through revealing those “extra-positive” rhetori- 
cal moves Jensen must make. We have struc- 
tured this deconstruction around a select set of 
these rhetorical moves that we believe capture, 
but do not in any way exhaust, the ways in which 
Jensen’s positive theory rests on a bed of figural 
language and rhetorical argument. The point 
here is not to discredit empirical accounting research.
Rather it is about the rhetoric which positions
Jensen’s brand of positivism as a privileged
epistemology and philosophy of science (see 
footnote 21). For the reader not familiar with 
Jensen’s text, it should be read prior to the re- 
mainder of this paper. 


Move I. The rhetoric of revolution and the 
aporia of positive theory

Jensen sets out to describe a positive theory of 
organizations, one which represents the foundations
of a “revolution in the science of organizations”
(p. 319, Abstract). [See Shils (1972) for a
discussion of the rhetoric of revolution.] He 
speaks of the development of “a” theory of 
organizations as if there isn’t one (or many) 
already and of the accomplishments of his own 
theory in the future tense as if inevitable; indeed,
as required — “Because such positive theories as 
these are required for purposeful decision mak- 
ing, their development will provide a better 
scientific basis for the decisions of managers, 
standard-setting boards, and government reg- 
ulatory bodies” (p. 319, Abstract). 

Let’s take this last sentence through a bit of 
deconstruction. What can one conclude from a 
literal reading of the sentence? First, the “theory”
is required for purposeful decision making.
Since the development of the “theory” lies in the 
future [“The foundations are being put in place,” 
“the development of a theory of organizations 
will be . . .” (p. 319, Abstract)] and, since it is re- 
quired for “purposeful” (rational? consistent? 
profitable?) decision making, then we literally 
conclude that whatever decisions are now and 
have been made in the past are not “purposeful” 
since “purposeful” must await the development 
of the theory. But this theory is a positive theory, 
based on “knowledge about how the world be- 
haves” (p. 320). Clearly, the data for such a 
theory of decision making in organizations must 
come from observation of existing decisions 
which are not yet purposeful since they can only 
become purposeful with the development of the 
positive theory. Thus, the data-generating pro- 
cess is not capable of providing evidence from 
“the way the world is” to support, in its scientific 
sense, the theory of organizations that Jensen has 
in mind. Further, these “nonpurposeful” deci- 
sion makers must be either making their deci- 
sions randomly or already acting upon incorrect 
“normative” theories since they are awaiting the 
development of a positive theory. Alternatively, 
if they are using their knowledge of “the way the 
world is,” then they either don’t need a positive 
theory or they already have one. 

Here we have a classic Derridean aporia, a self-engendered
paradox. Aporia derives from the
Greek word meaning impassable path, and in
Derrida’s hands it works to reveal the manner in 
which an author's philosophical, logical or scien- 
tific mandates must at some point burst from the 
pressure of the rhetorical configurations which 
can no longer be contained within the wrappings
of those mandates. It is at these points of
aporia that a text’s arguments are seen to depend 
upon something other than logical consistency 
since the aporias are themselves logical para- 
doxes. 

The discussion above demonstrates how the 
privileging of the positive over the normative in 
Jensen’s theory engenders its own contradic- 
tion. Positive theory relies for its privilege upon 
observation of the way the world is; that is, it will 
study actual decisions. But this positive theory is 
required for purposeful decision making. Thus,
decisions are not yet purposeful. So we will 
develop positive theory from a population of 
unpurposive decisions. Then decisions will 
become purposeful. Such is the conclusion 
based on Jensen’s own positivistic insistence on 
literalness, on the correspondence of sensations 
(visual awareness of things, words, etc.) with a 
“presence.” The text constructs its own aporia. 
Jensen’s concept of the “real,” the “positive,” has 
only rhetorical (normative?) not literal (posi- 
tive?) certification. It depends foremost upon an 
act of faith that Jensen can go outside of the data 
and find a “normative” structure to convert “the 
way the world is” to the way that he would like 
it to be. The “positive” is useful only to the ex- 
tent that it provides evidence that things are not 
quite right in “reality,” but some view of “rightness”
must lie within Jensen’s values and is both
necessary and normative. 


Move II. The normative/positive dichotomy
and differance 

Differing and dichotomies. The privileging of 
positive over normative in accounting research 
depends foremost upon its appeal to facts (re- 
vealed through empiricism/positivism) to dis- 
tance itself from values. The dichotomy norma- 
tive/positive moves linguistically through the 
metaphysics of presence and the belief in a 
capacity of positive theory to operate independently
of the normative. The positive/normative
dichotomy has been the most important aspect 
of privileging in recent accounting literature. 
Claiming to be — or being labelled — normative 
is a quick ticket out of many journals these days. 
Derrida and deconstruction have much to say 
about the manner in which such dichotomies or 
binary oppositions operate rhetorically, and we 
will turn that analysis to the positive/normative 
dichotomy in accounting research. 

Derrida views the privilege attached to the 
superior term in a binary opposition as a rhetor- 
ical move required to carry an argument 
through in the face of the argument’s inevitable 
failure to ground itself outside of the text. As 
Derrida states: “In a traditional philosophical opposition
we have not a peaceful coexistence of
facing terms but a violent hierarchy. One of the
terms dominates the other (axiologically, logi- 
cally, etc.), occupies the commanding position. 
To deconstruct the opposition is above all, at a 
particular moment, to reverse the hierarchy” 
(1981, pp. 56-57). In our case, it would reverse 
the positive/normative by pointing to the nor- 
mative within the positive; that is, it effaces the 
dichotomy. It uses the negative space created by 
the normative to show how the positive exists 
only parasitically. Its affirmative status exists 
only because of its denial of the normative. 

In establishing the normative/positive dichot- 
omy, Jensen states that “Answers to normative 
questions always depend on the choice of the 
criterion or objective function which is a matter 
of values” (p. 320). He then completes the 
privileged side with “Answers to positive ques- 
tions, on the other hand, involve discovery of 
some aspect of how the world behaves and are 
always potentially refutable by contradictory 
evidence” (p. 320). For Jensen, then, the presence
of objective functions (goals, values, etc.)
always and already places one in the domain of 
normative theory. Thus, to remain independent 
of normativeness, positive theory must deny 
itself objective functions (goals, values, etc.). 
What would such positive theory look like? With 
no goals, no values, no objective function, the re- 
searcher would have no basis to decide what “as- 
pect of how the world behaves” to investigate. 
As with John Barth’s Jacob Horner, the existen- 
tial angst of choice would leave one trapped in 
the arbitrariness of choice in an indifferent uni- 
verse that provides an infinity of choices. 

We tried to get some “positive” evidence of 
our own by running some organizational vari- 
ables through a regression without a specified 
objective function, without a dependent vari- 
able, since Jensen’s operational definition of 
positive theory mandates the absence of object- 
ive functions. We even tried it two ways, first 
with an unspecified dependent variable (Y) and 
secondly with Y assigned missing values. This is 
what we got: 


Proc Reg Data = File 1;
Model Y = X1 X2;

Error: Variable Y not found.
Note: SAS stopped processing this step because of errors.

Proc Reg Data = File 1;
Model Y = X1 X2;

Error: Less than 2 observations.


To distinguish the positive from the normative 
on the basis of the presence of an unspecified 
objective function is overwhelmingly more 
irrational than the views of those thinkers mis- 
leadingly labelled by some as “irrational” (e.g. 
Feyerabend, 1975; Rorty, 1979). 

Temporality, difference and differance. Jen- 
sen quickly follows these definitions of the nor- 
mative and the positive with a statement that 
suggests his own awareness of the presumptu- 
ousness of the dichotomy. But, to save it he in- 
jects temporality into his argument: “In the end, 
of course, we are all interested in normative 
questions; a desire to understand how to accomplish
goals motivates our interest in these
methodological topics and in positive theories” 
(p. 320, emphasis added). The innocent, subor- 
dinate, tiny phrase — “In the end” — structures 
an illusion that one can defer the normative; an 
ex ante positivism can work prior to the inevitable
normative — that policy decisions should
await the privilege of “science,” of “positive 
theory,” to indicate what should be done. 

Derrida is aware of how dichotomies depend 
on not just spatial “differing” but also temporal 
“deferring.” Derrida coins the term differance, a
complex concept which, nonetheless, explicates
the dependence of privileging terms on
both space (like the discussion of the normative/ 
positive above ) and on temporality. Its sense de- 
pends upon the two senses of the French verb 
differer, which inscribes the dual meaning “to 
differ” and “to defer.” Signification (the use of 
language) relies upon both of these meanings. 
First, to differ is to set up dichotomous terms like 
the “positive” and the “normative.” Their “differ- 
ing” force is atemporal. To “defer” suggests that 
rhetorical privileging of one term depends not 
only upon a differential “spacing” of terms but 
also “... the quest for a rightful beginning, an 
absolute point of departure, a principal responsi- 
bility” (Derrida, 1982, p. 6). 


For Jensen, the temporal ordering of positive 
to normative is desired. By deferring the norma- 
tive one is assumed to conduct positive science 
from an objective, value-free posture — the 
myth of a neutral beginning, an absolute point of
departure. But this produces the same kind of
aporia as the atemporal “differing.” The goals and 
values which originate research activity, which 
suggest research questions, are inextricably 
bound up with the research activity. How one 
wants to come out, what kind of belief structures 
(economics, etc.) inform the research 
enterprise, the “Received View” of a disciplinary
matrix, the theoretical vogues of the day, all
make deferral of the normative impossible. As
Melville (1986, p. 53) states, “the ‘subjectivity of 
the subject’ is found to lie outside of or prior to 
the subject. Every subject must be said to 
emerge into an already constituted, already 
structured world.” This means that every theory 
and discourse begins with the already present 
values of the theorist, and thus never speaks “for 
nature” independently of those values. There is 
no neutral posture from which to originate positive
research. Marxists are well aware of this
with their notion of ideology, as was Einstein
(1953, p. 38). 

So Jensen is right on the mark when he says 
that in the end we are all interested in normative 
questions, but he leaves out the beginning and 
the middle. To the extent that all research is pur- 
poseful, “reality” is malleable enough, “nature” is 
sufficiently pluralist, and language is rhetorically 
powerful enough to permit the writing of any 
text about “the way the world is” that we choose. 
But we must choose, and choice is neither neut- 
ral nor natural. Thus the normative character of 
research and theory is “always and already” pre- 
sent, a phrase Derrida and others use frequently 
to make clear that it is not a matter of choice, that 
normativeness is not avoidable. Jensen must 
either be subject to the same indictment, or, 
through some “God’s-eye view,” some transcen- 
dental birthright, he can emerge on the scene to 
produce “correct” observations on reality. The 
dichotomy normative/positive vanishes in the 
face of both the impossibility of such differences 
and the impossibility of temporally deferring 
one until completion of the other.9


Move III. Supplementarity: bandaging the
aporetic 

The inability of positive theory to stand alone 
causes Jensen to inject all sorts of what Derrida 
terms “supplements” into his analysis. Within 
deconstruction, as Culler (1982, p. 103) states: 


The supplement is an inessential extra, added to some- 
thing complete in itself, but the supplement is added in 
order to complete, to compensate for a lack in what was 
supposed to be complete in itself. These two different
meanings of supplement are linked in a powerful logic,
and in both meanings the supplement is presented as exterior,
foreign to the “essential” nature of that to which it
is added or in which it is substituted. 


Observation, supposedly complete in itself, 
must be supplemented by Jensen with “defini- 
tions” and “tautologies.” This is not to suggest 
that such supplements can be avoided; they are 
always necessary. Rather, it is to point out that 
privilege claimed by authors for positive theory 
(or any other theory) becomes questionable 
with the presence of supplements. From a Derri- 
dean perspective, there is no unsupplemented, 
primary, first principle, only a (normative) de- 
sire for it, or a myth creating it (Leitch, 1983, p. 
171). Specifically, it is difficult to determine why 
some definitions and tautologies are acceptable 
for Jensen while others are not except on norma- 
tive grounds. What are the boundaries on the in- 
evitable definitions and tautologies that could 
possibly tell us when we have a positivist and 
when we have a normativist? Jensen goes to 
great length to defend his use of tautologies and 
definitions in building a positive theory of the 
firm. These may be viewed as supplements to the 
privileging of observation that gives positive 
theory its claim to superiority over normative 
theory. His defense of these supplements is an interesting
rhetorical study. Observation is subverted
by using “extra-observation” grounds to
suggest what kinds of observations are approp- 
riate. 

Jensen proposes two “useful” tautologies: sur- 
vival of the fittest and the minimization of 
agency costs. He defines a tautology as follows:
“In the language of science, a tautology is a state- 
ment that is true by definition and can never be 
refuted by evidence” (p. 329). He excuses him- 
self from the rigor of philosophy with the caveat 
“Philosophers have a precise definition of tautol- 
ogy. I use the term here more loosely and more 
in accord with its use in the social sciences” 
(footnote 14, p. 329). 

Where are we now? We know that Jensen’s 
tautologies are not “philosopher’s tautologies”
and that they have some other (unspecified) use 
in the social sciences. We do not know why 
these two tautologies are more appropriate than 
some others. The reader is left to fill in the gaps. 
The Oxford English Dictionary provides five entries,
all of which entail tautology as concerned
with repetition, not definition. Its etymology is a
compound of the Greek “tauto” and “logica”
defined as “same” and “speech or reason” respectively;
hence, same (repetitive) speech or
reason. In analytic philosophy, where science is
more likely to get its definitions, tautology is a
law that can be shown on the basis of certain
rules to exclude no logical possibilities. The Encyclopedia
of Philosophy formally defines a
tautology as “A compound proposition that is 
true no matter what truth-values are assigned to 
its constituent propositions. Thus, ‘A or not-A’ is 
a tautology, since if ‘A’ is true, then the whole 
proposition is true, and if ‘A’ is false, then ‘not-A’ 
is true and by definition the whole proposition is 
still true.” Thus a tautology in either its analytic 
philosophy sense or its common usage is not a 
definition, or a statement that is true by definition,
which is how Jensen uses it.10
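
The Encyclopedia of Philosophy definition quoted above can be checked mechanically. The following is only a minimal sketch, not part of Jensen's text or of the sources cited: it enumerates every assignment of truth-values and reports whether a proposition survives all of them. The names `is_tautology`, `excluded_middle` and `conjunction` are illustrative assumptions introduced here.

```python
from itertools import product


def is_tautology(formula, num_vars):
    """Return True if `formula` is true under every assignment of truth-values.

    `formula` is any callable taking `num_vars` booleans (a hypothetical
    stand-in for a compound proposition).
    """
    return all(formula(*values)
               for values in product([True, False], repeat=num_vars))


# 'A or not-A': true no matter what truth-value is assigned to A.
excluded_middle = lambda a: a or (not a)

# 'A and B': a contingent proposition that some assignments falsify.
conjunction = lambda a, b: a and b

print(is_tautology(excluded_middle, 1))
print(is_tautology(conjunction, 2))
```

On this reading a tautology excludes no logical possibility, so no assignment (no "evidence") can count against it, whereas the contingent proposition is refuted by at least one assignment.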


9 Similarly, with Watts & Zimmerman (1979), whose positive theory is grounded in self-interest, one cannot take seriously
their “truth” statements since they are claiming responsibility for saying only what maximizes their own self-interest. 


10 Doxology: (a) The utterance of praise to God, thanksgiving (b) A short formula of praise to God, esp. one in liturgical use,
spec. the Gloria in excelsis or “Greater Doxology”, the Gloria Patri or “Lesser Doxology”, or some metrical formula, such as
the verse beginning “Praise God from whom all blessings flow” [Oxford English Dictionary].

Doxy: Opinion (esp. in religious or theological matters) (OED).

Tauto-dox[y]ology: Jensen's term is in here somewhere. QED. 
Survival of the fittest and minimization of
agency costs within Jensen’s positive theory of 
“how the world is” act as supplements which en- 
sure that his theory leads to the kinds of conclu- 
sions that follow from his political and economic 
beliefs.11 In lifting tautology out of its “scientific”
context of analytic philosophy and appropriating
it for positive theory, he [mis]uses science
that is championed as a privileged way of speak- 
ing (writing). This is particularly so since tautol- 
ogy is essential to the early twentieth century 
work in analytic philosophy that was designed to 
clean up the nineteenth-century brand of naive
positivism which comes closest to Jensen’s own 
view. Further, in a formal tautology, no pos- 
sibilities are excluded. Yet Jensen praises posi- 
tive theory (supplemented by tautologies) 
because “Answers to positive questions ... are 
always potentially refutable by contradictory 
evidence” (p. 320) while “normative proposi- 
tions are never refutable by evidence” (p. 320). 
Ironically, then, a positive theory that relies 
upon tautologies (philosopher’s or other’s) is al- 
ways and already a normative theory since it can- 
not be refuted by evidence. 

Jensen's (mis)appropriation of tautology also
raises interesting questions about the evidence 
from his own research into organizational 
behavior. Given the (mis)use of tautology, specifically,
survival of the fittest, given that no possibilities
are excluded by a tautology, any evidence
would (or could) be consistent with the
tautology. Jensen laments that “The word tautol- 
ogy has strong pejorative overtones ...” (p. 
330). Yet within the scriptures of “good” science 
that Jensen uses for privilege, there is a good rea- 
son for these “overtones.” The rhetoric of 
science depends, above all else, on the possibil- 
ity of negation, on preserving the nonpresence 
of the hypothetical relationships that one as- 
serts. The concern in empirical science over 
tautologies can be redoubled to Jensen’s own 
earlier warning about “incorrect” positive
theories. Jensen states: “Furthermore, using incorrect
positive theories or ignoring important
constraints leads to decisions that have unex- 
pected and undesirable outcomes” (p. 321). We 
are not sure what Jensen means by “incorrect” 
positive theories or “undesirable” outcomes, but 
resting a positive theory atop a tautology that is 
irrefutable by evidence strikes us as a good can- 
didate. In any case, the history of science reveals 
a junkpile of previously “correct” theories that 
are now deemed “incorrect”, theories ranging 
from the Ptolemaic solar system to phrenology. 
A correct positive theory is thus always already 
incorrect if we take seriously that “data” has al- 
ways been shown to be a fallible guide to evi- 
dence about “the way the world is.” 


Move IV. Self-interest and survival of the fittest 
as normative ethics 

Jensen shares with other positive accounting 
theorists a faith in self-interest, Adam Smith’s in- 
visible hand, as the motive for human behavior 
that has “good” social consequences. Self- 
interest is a panchreston, a universal panacea, a 
term so broad that it is meaningless. It is a form 
of what Wayne Booth (1974) terms Motivism, a 
belief that before observing a specific behavior 
one knows why it and all other behaviors took 
place.12 It is one of Kenneth Burke’s God-terms.
It rationalizes all kinds of undesirable behaviors 
like avarice — even suicide and starvation. It is 
an ethic as in Ivan Boesky’s comforting message 
to the business students at Berkeley — it’s alright 
to be greedy (California Magazine, May 1987, 
p. 57). But, most importantly, it is impossible to 
falsify, to demonstrate its absence; thus it is im- 
possible to build a positive theory around it — it 
defies refutation. 

Jensen re-writes self-interest as survival of the 
fittest and says it “completes most of the major 
building blocks of the analytical framework for 
creating a theory of organizations” (p. 331). 
Analytical? Self-interest and survival of the fittest
have at least as much ethical import as they do
“analytical” import. (See Lehman, 1987 for a
critique of the implications of such an ethic for
accountants). As an ethic, it conjures the ghost
of Thomas Hobbes and the law of the jungle. We
will do some empirical analysis of our own
below to make clear the kinds of “undesirable
outcomes” that might result from a positive,
economic approach to self-interest which omits
other ethical dimensions of human behavior.

11 “Survival of the fittest” as a tautology is a euphemism for the political beliefs in deregulation and libertarian economics that
are the heart of the Rochester program. It is a way to veil political and normative rhetoric with a rhetoric of science.

12 Watts & Zimmerman (1978, p. 113) observe the highest standard of “motivistic” argument when they state “... we assume
that individuals act to maximize their own utility ...” or, more forcefully, “... the only accounting theory that will provide a
set of predictions that are consistent with observed phenomena is one based on self interest” (1979, pp. 300-301).

Economists are fond of studying cases in the 
limit in order to glean their generalities. We 
might look for positive evidence of a society that 
is grounded in a limit case of self-interest. 
Turnbull (1972) tells of such a society. There is 
a Ugandan mountain tribe, the Ik, whose morals, 
whose humanity, extend only to taking what- 
ever they can in order to survive. “Economic 
interest is centered on as many individual 
stomachs as there are people, and cooperation is 
merely a device for furthering an interest that is 
consciously selfish ... the Ik have dispensed 
with the myth of altruism, but they have also 
largely dispensed with acts that in reality served 
at least mutual interests” (Turnbull, 1972, p. 
157). Children are turned out at age three. “It 
was rather commonplace, during the second 
year’s drought, to see the very young prying 
open the mouths of the very old and pulling out 
food they had been chewing and had not had 
time to swallow” (Turnbull, 1972, p. 261). A dis- 
abled child, after social ridicule (humor), re- 
turned home. Her parents accepted her; locked 
her in the home; left; she starved. Old people are 
left to die alone. 

This story is not as remote as we might think. 
In a recent commentary upon the American 
C.E.O., Prokesch (1987, p. 1) states: 


... corporate survival cannot be taken for granted. So,
survival must now be the chief executive's overriding 
concern ... The new order eschews loyalty to workers, 
products, corporate structure, businesses, factories, 
communities, even the nation. All such allegiances are 
viewed as expendable under the new rules. With survival 
at stake, only market leadership, strong profits and a high 
stock price can be allowed to matter. 


“Luckily the Ik are not numerous ... so I am
hopeful that their isolation will remain as complete
as in the past, until they die out completely.
I am only sorry that so many individuals will
have to die, slowly and painfully, until the end
comes to them all” (Turnbull, 1972, p. 285).

We doubt that positive theorists would find
such a society desirable, but this society has very
successfully both minimized agency costs and
based itself on the other “useful tautology,”
survival of the fittest. The question, from a
deconstructionist perspective, is: what kinds of
ethical supplements are necessary to positive theory
in order to take us away from the limit case of the
Ik?


Move V. More supplementarity: licensing poor
science

Just as Jensen is unwilling to accept responsi- 
bility for his use of tautology, he wants exemp- 
tion from the demands intellectual history 
places on the use of an important term like posi- 
tive. He again appeals to the term positive within 
the mysterious black box of usage “in the social 
sciences” (p. 320). In a related footnote he states
“The use of the term ‘positive’ in this context has 
had the unfortunate effect of linking accounting 
researchers who have been engaged in the effort 
to develop ‘positive’ theories with ‘logical 
positivism,’ a school of thought in philosophy 
which has been controversial. The proposal to 
focus on positive theories of accounting does 
not commit those who propose it to logical 
positivism” (footnote 1, p. 320). But this dis- 
claimer leaves one washed ashore between an 
earlier positivism, approximated by Jensen’s 
view, and the later logical positivism, a system 
which at least addressed the epistemic implica- 
tions of the language problem. It is as if we are 
expected to return to granting licences to avoid 
the linguistic and rhetorical. Logical positivism 
is neither an irrelevant nor a controversial supp- 
lement to positivism as is implied by Jensen. It 
“corrected” the mistakes of early positivism and 
did much to set twentieth century philosophy 
on its linguistic course. Jensen thus denies re- 
sponsibility for both language (writing) and the 
implications of language for the knowledge 
claims that his positive theory would assert.

Recall that the logical positivists (see Ayer, 
1959) reclaimed a status for mathematics and
logic astride the positivists’ “observations.” For 
them, “analytic” statements were defended as 
truth claims based upon “pure reason.” Alterna- 
tively, “synthetic” propositions were grounded 
in experience. It is ironic how the early 
positivists’ exclusion of analytical knowledge re- 
writes itself in Jensen’s attempt to privilege his 
own positive agency literature over the 
mathematical principal-agent literature. He
attempts this privilege by an appeal to the
empirical and a dilution of the mathematical. As he
states: “The issue boils down to an empirical
question regarding how useful the preference,
stochastic structure, and information structure
variables are in explaining observed contracting
practices. The positive agency literature
proceeds on the implicit assumption that the
variables emphasized in the principal-agent
literature are relatively unimportant in understanding
the observed phenomena when compared with
richer specifications of information costs, other
aspects of the environment, and the monitoring
and bonding technology” (p. 335). One cannot
accept the privileging of the observational (the 
synthetic) over the analytic (mathematico-logi- 
cal) after logical positivism. Nor can one accept 
the privileging of either of the two over other 
kinds of languages after the developments of the 
last 50 years that gave logical positivism, in
Jensen’s terms, its “controversial” status. Jensen
would have us believe that logical positivism is 
controversial because of its refusal to privilege 
observation. But the attempt to privilege obser- 
vation is what logical positivism corrected. Logi- 
cal positivism is not controversial with respect 
to what preceded it, as Jensen implies, but what 
came after it; that is, the development of what 
Rorty (1979) calls “the linguistic turn” in mod- 
ern philosophy. The point here is that Jensen’s 
naive realism is a weak form of positivism which 
was a philosophical movement that ignored the 
role of language and privileged observation.
Logical positivism and all philosophical schools
subsequent to it recognized the fact that
observation was not a privileged way of knowing and
that knowledge claims were conditional upon
language. Calling logical positivism controversial
from a positivist perspective is like calling
Einstein controversial from a Newtonian one.


Move VI. Jensen re-writes ASOBAT:
on intertextuality and traces

Derrida’s notion of intertextuality and the 
“trace” provides another key to the inseparabil- 
ity of positive from normative theory. Derrida 
views writing as a product of previous texts; he 
states: “Whether in the order of spoken or writ- 
ten discourse, no element can function as a sign 
without referring to another element which 
itself is not simply present. This interweaving 
results in each ‘element’ ... being constituted on
the basis of the trace within it of the other ele- 
ments of the chain or system. This interweaving, 
this textile, is the text produced only in the 
transformation of another text” (1981, p. 26). 
Derrida’s own “texts” reflect this interweaving 
and involve the close reading and re-writing of 
earlier philosophy texts. 

Jensen offers a rhetoric of revolution as a way 
to distance himself from the earlier texts (the 
normative theories) that influence his writing. 
But such distancing ignores the fact that our 
thought, theories, and research all originate with 
an understanding that we have come to through 
exposure to previous texts. As Culler (1982, p. 
134) states: 


What deconstruction proposes is not an end to distinctions,
not an indeterminacy that makes meaning the invention
of the reader. The play of meaning is the result of
what Derrida calls the “play of the world,” in which the
general text always provides further connections,
correlations, and contexts.


Jensen’s positive theory certainly takes the form 
it does as a result of his readings in “economics,” 
“finance,” “accounting,” etc. His own text must 
be considered as a weaving within this “textile,” 
this fabric of the “traces” of his understanding of 
the way the world is, not as a prophetic gesture 
which operates independently. As Felperin 
(1985, p. 31) states: 


... texts resemble so many Pacific islands in a vast coral
reef of textuality, all outwardly distinct yet uncertainly 
connected with and supported by each other in an elab- 
orate submarine network. 
It is under the water, within the negative space,
where one cannot “see” the operation of inter- 
textuality and the traces that always root what 
one is capable of “seeing,” or “writing.” For 
“observations” can always be turned back to 
those prescriptions which lead one to observe 
this instead of that. An empirical observation be- 
gins to slip and slide on the murky bottom of this 
coral reef where it becomes obvious that such an 
observation can be interpreted (can be written) 
from any number of originary texts.

In Glas, Derrida uses split writing to illustrate 
intertextuality. This involves presenting one 
text on the left side of the page and another on 
the right. The juxtaposition of the two ironically 
points to the commonalities of what are taken to 
be un-alike texts and makes salient the difficulty 
of privileging one over the other or even of dis- 
tinguishing between the two. As an example, we 
illustrate in Table 1 the intertextuality between 
Jensen and ASOBAT (1966), perhaps the most 
well-known normative text on accounting 
theory.
ASOBAT begins with explicitly normative
statements about the nature of accounting and
its objectives. Jensen makes parallel moves, first
linking accounting to organizational structure
(his own version) and then defining accounting
as serving the organization’s need for control.
Jensen re-writes stewardship as a “control motif”
to better serve the contracting and conflict
orientation of his normative theory of organizations.
Each text then asserts improving accounting
practice as its ultimate goal. For ASOBAT,
this means meeting the four standards of
relevance, verifiability, freedom from bias, and
quantifiability. For Jensen, this means accepting his
view of organizations and then pointing to
evidence that supports that view. Jensen thus
begins with the same normative grounding as
ASOBAT, re-traces the normative moves, but
distances himself with an appeal to empirical
evidence which he neither presents nor establishes
a scientifically credible basis for. Both Jensen
and ASOBAT tell a story about what we are doing
and how to do it better; both are, as is all inquiry
into human community, normative texts.


TABLE 1. Intertextuality: Jensen Re-Writes (Traces) ASOBAT


ASOBAT


In developing this statement, the committee has sought
... to identify the field of accounting so that useful
generalizations about it can be made and a theory
developed (p. 1).

The objectives of accounting are to provide information
for ... maintaining and reporting on the custodianship
[stewardship] of resources (p. 4).

The purpose in developing a theory of accounting is to
establish standards for judging the acceptability of
accounting methods [policies] (p. 6).

Four basic standards are recommended as providing
criteria to be used in evaluating potential accounting
information: relevance, verifiability, freedom from bias,
and quantifiability (p. 7).

In developing this statement, the committee has sought
... to point out improvements in accounting practice ...
(p. 1).


Jensen


Accounting is an integral part of the structure of every
organization ... (p. 319).

Accountants have long recognized the importance accounting
has played in the stewardship or control of organizations,
and this is consistent with the notion that accounting is a
basic part of organizational structure and that accounting
practice and organizational form are related (p. 323).

... policy questions are, of course, both interesting and
important, and they are best answered with knowledge of a
wide range of positive theory — that is, knowledge about
how the world behaves (p. 320).

... a fundamental understanding of why accounting practices
evolve as they do and how to improve them requires a deeper
understanding about organizations than now exists
... (p. 319).


Move VII. Behavior and organizations: more
aporia and differance

Much of Jensen’s theory depends upon con- 
tracts; contracts provide the operational 
metaphor for the kind of evidence necessary to 
support his theory. He is, however, concerned 
with organizations and views contracts as con- 
stitutive of organizations: “A major challenge fac- 
ing social scientists is the development of a body 
of theory to explain why organizations take the 
form they do and why they behave as they do” 
(p. 319, emphasis added). But he is aware that 
organizations are inanimate and do not literally 
“behave.” People behave, but people are difficult 
to aggregate into the kind of globality necessary 
to support an economic theory of organizations. 
Jensen is aware of the anthropomorphic error in 
assuming that organizations behave, as is clear in 
his criticism of microeconomic theory of the 
firm: 


Unfortunately, the vast literature of economics that falls
under the label of “Theory of the Firm” is not a positive
theory of the firm, but rather a theory of markets. The
organization or firm in that theory is little more than a
black box that behaves in a value- or profit-maximizing
way .... In this firm there are no “people” problems or
information problems, and as a result the research based on
this model has no implications for how organizations are
structured or how they function internally (pp. 325-326).


But the greatest limitation of the economic 
theory of the firm is, in Jensen’s view, its mislead- 
ing anthropomorphism: 


The danger in its [the firm as a black box] use arises
because it further encourages the tendency to personalize
organizations by attributing motives and preferences
to what is in fact a complex equilibrium system.
Such personalization of organizations easily leads to
uncritical application of the black box approach to
questions it cannot handle (p. 328, emphasis and insert
added).


He then proposes an alternative definition of the 
firm: 


I believe it is productive to define an organization as a
legal entity that serves as a nexus for a complex set of
contracts (written and unwritten) among disparate
individuals (p. 326).

This view of organizations focuses attention on the 
nature of the contractual relations among the agents who 
come together in an organization — including suppliers 
of labor, capital, raw material, riskbearing services, and 
customers (p. 326). 


This contracting view supposedly cuts through 
the anthropomorphism and gets at the real be- 
havior which occurs within the firm: 


The nexus of contracts view of organizations also helps to 
dispel the tendency to treat organizations as if they were 
persons. Organizations do not have preferences, and they 
do not choose in the conscious and rational sense that we 
attribute to people (p. 327). 


So Jensen argues that “real,” “literal,” “posi- 
tive” behavior of people (agents within the or- 
ganization) can be observed not by the black 
box of economic theory but by the observation 
of contracts (both “written” and “unwritten”, p. 
326). The point here is that neither contracts 
nor organizations literally “behave” nor are Jen- 
sen’s contracts necessarily even observable if 
they are “unwritten.” Even so, how many are 
there? What kind of literal specification do we 
need to know when we have a contract and 
when we don’t? Which ones do we attend to? 
What portion of human activity is literally under 
the auspices of contracts? What kind of behavior 
do we ignore? How do we map back to behavior 
from contracts? Who are the contracting mem- 
bers of the organization? 

To privilege contracts is just as vacuous as 
speaking of the behavior of the firm. Having 
pointed out the anthropomorphic character of 
the economic theory of the firm, Jensen simply 
supplants it with his own anthropomorphic 
“nexus of contracts” view: “The behavior of the
organization is the equilibrium behavior of a
complex contractual system made up of 
maximizing agents with diverse and conflicting 
objectives” (p. 327, emphasis added). A “nexus 
of contracts” no more “behaves” than an organi- 
zation does; nor is it more “observable,” or less 
concerned with “motives.” Nothing is really 
changed; he wants to slide in “contracts” for
“people.” This is clear in his leap to the statement
“the individual agent is the elementary unit of 
analysis” (p. 327) in the nexus of contracts view. 
The unit of analysis is a select set of contracts, 
certainly not people. 

Another troubling aspect of his definition is 
the equating of equilibrium behavior with or- 
ganizational behavior. Equilibrium is, in econ- 
omics if not in physical science, a static concept. 
It has no behavioral, no animate, signification. 
The Penguin Dictionary of Economics defines 
it as “A state in which forces making for change 
in opposing directions are perfectly in balance 
so that there is no tendency for change.” How 
does one come to terms with the apparent 
oxymoron “equilibrium behavior” when 
equilibrium is static and behavior is necessarily a 
process? We are aware that such use of equilib- 
rium is accepted; but it is normative — it is an act 
of faith that people will strive toward resolution 
of conflict, that they will change; it is always and
already both just out of reach and antithetical to
behavior. Equilibrium is a concept unique to a
certain kind of normative economics, a con- 
struct of normative economic modelling, a 
metaphysic antithetical to, say, a metaphysic of 
anarchy or chaos. One can “point” to evidence 
for firms as “organized anarchies” just as easily as
one can point to the homeostasis of equilibrium. 
Like Zeno’s paradox of motion — that motion is 
nonexistent and only interpretable by its past 
and its future — equilibrium “behavior” is not 
behavior; rather, it is a “writing” based on differ- 
ance. 

Where is the positive distinction between the 
“behavior of organizations” and “the behavior of 
contracts”? Where does economics begin and 
E[I]conomics end? The eco (the household;
people) and the Icon (stylized models of an 
“economic” clerisy) merge indistinguishably in 
the text. The microeconomic theory of the firm 
and the nexus of contracts view are only com- 
peting clerisies. 


Move VIII. Organizational survival of the
fittest: some “recent results” without results

We can pursue the survival/extinction
metaphor a bit further in Jensen’s text, pointing
out how his analysis reveals nothing about the
characteristics of survival that we might use to 
inject the required “scientific” structure to his 
theory. To scientifically motivate survival of the 
fittest, he states: 


The manner in which we use tautologies to develop 
positive theories is closely related to the nature of the 
scientific process itself. The process involves the use of 
the definitions and the underlying tautology (such as the 
survival of the fittest) and a subset of the available data on 
surviving and extinct species to develop propositions 
about the important aspects of the environment and their 
relation to traits contributing to survival (pp. 330-331,
emphasis added).


Thus, for good science, one must study surviving 
and extinct firms (seance may be substitutable 
for science here since we must recall the dead) 
to identify characteristic differences; to do Jen- 
sen’s science, one must also protect at all costs 
the faith in the “definitions and the underlying 
tautology.”

In a section subtitled “Some Recent Results 
On Control” (p. 328), a section which, by the 
way, includes neither its own nor any other em- 
pirical results, Jensen suggests that the key to 
survival is “separation of decision management 
(the initiation and implementation of decisions) 
from decision control (the ratification and 
monitoring of decisions) ...” (p. 328). As good
scientists, we might expect a discussion of well- 
designed analyses of levels of this separation as 
the treatment effect nested within a sample of 
surviving and extinct firms, with other factors 
that might influence survival either controlled 
or randomized. Yet we are not given any results 
(“recent” or otherwise); instead, we are given 
some narrative about “legal form” of operation 
and some rhetoric to suggest that legal form 
both does and does not matter.

From the following sentence, we might con- 
clude both that empirical research has been 
done and that legal form is in fact the key to sur- 
vival: “Yet, even though other organizational 
forms such as proprietorships, small partner- 
ships, and closed corporations compete with 
corporations . . . the evidence is clear: in the
production of a wide range of activities, the
corporation continues to win the competition for survi-
val” (p. 328). Here is another aporia: legal form 
is not supposed to matter since “separation of 
decision management and decision control” is 
the relevant hypothesis, but this is refuted by the 
“evidence” that “the corporation continues to 
win the battle for survival.” Why does the
corporation win the survival battle, itself a
contradictory assertion/conclusion since “other
organization forms . . . compete with corporations . . .”
(p. 328)? To disentangle ourselves, we would need
a study which shows that this “separation” is
“high” for any surviving form and “low” for any
extinct form.¹³ If we can conclude anything from
the “recent results,” it is that corporations might
be the most common form of surviving
organization, though we don’t have a clue as to why.¹⁴

Earlier in the text we are given a clue that mar- 
ket segmentation might be the key to survival. 
Jensen discusses organizational forms in retail- 
ing like shopping centers, large multi-product 
department stores and independently-owned 
speciality stores: 


... It is easy to see how comparisons of such organization
forms lead to questions regarding the factors that give
competitive advantages to each of these three
organizational types (shopping centers, department stores, and
independently owned specialty stores) at various times
and at various locations. Such questions are relevant
because we know all three types of organizations
continue to compete and survive (p. 327, emphasis added).


What we conclude from such “observations” is 
that different organizational forms exist in retail- 
ing and successfully compete with one another. 
Observing different forms of retail operations 
suggests that factors such as different marketing 
strategies affect survival more readily than the 
factor “separation of decision management and 
control.” In any case, we are again left without
any evidence about surviving and extinct forms
or about separation of decision management and
control and should wonder why we are reading
if we are good positivists.


Move IX. Yet more supplements: my science 
and everybody else’s 

Jensen’s text moves throughout with appeals 
to the status of science to distance itself from 
other (presumably) nonscientific theories. The 
words “science”, “scientific” or “scientists” are 
used 31 times. But the text's failure to adhere to 
its own notion of science appears when a plea is 
made (pp. 332-333) for a separate standard of
evidence, a different epistemology, to validate 
the knowledge claims of positive theory. Jensen 
discusses the institutional and qualitative nature 
of the evidence, the difficulty of getting 
evidence at all, and the resultant problem of how
to generalize from such evidence. For example: 
“... Many important predictions of the research 
on positive organization theory and positive 
accounting theory will be characterizations of 
the contracting relations, and much of the best 
evidence on these propositions will be qualita- 
tive and institutional evidence ...” (p. 332); 
further, “Not all institutional evidence is readily 
available . . .” (p. 333); and finally, “By its nature, 
much of this institutional evidence cannot be 
summarized by measures using real numbers. 
We simply do not know how to aggregate such
evidence ... Statisticians and econometricians 
are likely to react because it violates a long and 
venerable tradition of formal testing” (p. 332). 

This appeal for an extra-science evidence 
standard is a supplement. If positive theory is 
good science, then it should be judged by the 
criteria of good science. While grounding posi- 
tive theory in science, Jensen wants different 
grounds, a supplement, to prop up the evidence,
to hasten its acceptance, to quicken the “revolu- 
tion in organization theory.” That this appeal for 
a scientific theory to excuse itself from science is 
even made suggests that the theory may not be
particularly persuasive when compared to
others.¹⁵ Kaplan (1983, p. 344) makes a similar
point in commenting on Jensen’s paper. Jensen
even suggests that criticisms of the evidence are
counterproductive: “The practice of using
pejorative labels such as ‘casual,’ ‘anecdotal,’ or
‘ad hoc’ to describe such institutional or
qualitative evidence is counterproductive to the
research process” (p. 333). Counterproductive,
that is, to Jensen’s desire to persuade us of the
correctness of positive theory by rhetorical
appeal to science without actually adhering to
scientific canons.

13 What are these “extinct” species? Greek city-states are alive and well in New England villages, as are medieval guilds in trade
unions. The Church is and always has been an economic institution, perhaps the oldest surviving form and therefore the one
we should be studying.

14 We also suspect that identifying corporations as the most common form of organization is questionable and dependent
upon the selection of specific metrics.

Ironically, it is the same sort of pejorative 
rhetoric that positive theorists in accounting 
have used to privilege their own work over 
others: 


Prescriptions in the accounting literature are based on
hypotheses about observed phenomena in capital markets,
political process, and other areas. Rarely do any of
the prescribers suggest that the hypotheses be tested
formally, let alone perform such tests. Moreover, the
hypotheses are often inconsistent with currently
accepted theories in finance and economics (Watts, 1977,
p. 54, emphasis added).

Undoubtedly there are alternative theories which can
explain the timing of the accounting literature. The
challenge is to those who would support those alternative
theories to specify them and show that they are more
consistent with the evidence than ours (Watts &
Zimmerman, 1979, p. 300, emphasis added).

Accounting theorists became more concerned with policy
recommendations, they became more normative —
concerned with what should be done. Very little concern
was exhibited for the empirical validity of the
hypotheses on which the normative prescriptions rested (Watts
& Zimmerman, 1986, pp. 4-5, emphasis added).





We are sympathetic with Jensen’s point about 
qualitative and institutional evidence. He says 
“it is unwise to ignore important institutional 
evidence while paying great attention to unim- 
portant quantitative evidence...” (p. 333). This 
seems to be an indictment of scientific reduc- 
tionism for its exclusion of contextuality. In fact, 
it sounds rather like a Marxist appeal to social 
milieu or a Foucauldian appeal to genealogy. Yet 
in Jensen’s hands, it seems little more than a
rhetorical ploy to have one’s cake (the privilege
of science) and eat it too (without going through
the “rigor” of scientific practice). The sort of 
“science” Jensen preaches more closely resem- 
bles Feyerabend’s (1975) dictum of “anything 
goes” than the law and order of rational scientific 
inquiry to which he appeals. The extra-science 
appeal subverts the rhetoric of Science which he 
uses to ground and privilege positive theory 
over the presumably nonscientific theories he is 
in competition with. 

Our reading of Jensen’s text is just a smatter- 
ing of the possibilities for calling into question. 
claims to privilege in the text. Jensen’s prescrip- 
tions about “good” science are the basis for its 
deconstruction, and, not surprisingly, the text 
falls far short of its own mark.'* There is nothing 
unique about the relative ease with which Jen- 
sen’s text is deconstructed, any claim to founda- 
tional author[ity] is subject to deconstruction.'” 
We suspect that the rhetoric of positive theory is 
motivated by academic politics. Nick Dopuch 
has pointed out (see Kaplan, 1983, p. 343) that 
the positive accounting research tradition de- 
fined by Jensen as originating in the mid-1970s is 


15 A good example of subjecting the “evidence” of Jensen's colleagues Watts & Zimmerman to the rigours of science appears 


in McKee et al.'s (1984) reworking of that evidence. 


16 Our use of aporia, differance, supplement and trace by no means exhausts Derrida’s mode of critical reading. Other 


Derridean “inventions” include the spur, writing under erasure, Pharmarkon and hymen. 


17 The texts of deconstruction are “deconstructible” even though they are not objectivist and foundational. We may not like 
the “logocentric tradition” which has always ridiculed rhetoric, but we are all products of it. This is the Derridean double- 
bind, the Nietzschean resolve to escape the bounds of the logocentric tradition is encased within that tradition. As Melville 


(1986, p. 3) states, Derrida’s writing style makes this clear: 


Derrida is thus double-bound, his words necessarily at war with themselves, struggling to exempt themselves from the very 
grammar in which they are caught up and by which they mean. Inevitably they are qualified, hedged with quotation marks, 
“written under erasure,” submitted to what a late essay on psychoanalysis refers to as “ploys of designification.” 
no different from the empirical tradition which
began in the mid-1960s. However, the University
of Chicago is generally viewed as the originary
institution for the empirical tradition. In
order to distance itself from Chicago, the University
of Rochester used the rhetoric of positive
theory to create “property rights” to a new term 
for an old movement and created a journal, 
Accounting and Economics, to promote the 
cause. Thus, arguably, considerable institutional 
power shifted to Rochester. 

The Rochester and Chicago movements are 
similar in that both exploit the positive/norma- 
tive dichotomy for their power. Given that twen- 
tieth-century thought about science makes 
claims to non-normativeness risible, our own 
view is that this shift in accounting research in 
the mid-1960s was largely a technological shift;
in short, it was the birth of our modernism. With
the advent of CRSP tapes and other data bases
and with the injection of statistical foundations
to doctoral education in accounting, the “empirical”
era ushered in a new style of writing, not a
superior way of knowing. Thus what distinguishes
accounting authors, pre- and post-empirical,
is not “values/facts”, “normative/positive”,
“truth/opinion”; but, rather, technology or
writing style.¹⁸


DECONSTRUCTION, THE POLITICS OF 
DISCOURSE AND ACCOUNTING “TEXTS” 


18 In a similar manner McCloskey (1984, p. 850) says:


In The Post Card, Derrida focuses upon subversion
of [phal]logocentric discourse through a
series of love letters which speak allegorically to 
knowledge production through a language of 
seduction rather than a language of rational dis- 
course. The letters are written to a voiceless 
lover and dispatched through various postal net- 
works as Derrida proceeds from university to 
university delivering lectures.¹⁹ We can extract
from The Post Card an issue directly relevant to
this paper: Derrida’s play upon the metaphors of
“letter” (research paper), “poste” (guardian) 
and “post office” (the networking of research 
distribution). Here we envision production and 
reproduction of knowledge within a discipline 
governed by a transportation model of rules 
about what can be said, how and where it can be 
distributed, and what taxes are imposed upon 
the passage. This postal metaphor serves to represent
all of the sociological, political, historical
and linguistic contextuality within which
knowledge production takes place. These are 
the issues that Foucauldian scholars in account- 
ing are trying to unpack (see Hopwood, 1987; 
Miller & O’Leary, 1987; Loft, 1986) in order to 
subvert the notion that accounting is always and 
only a techne of progress. But for our purposes, 
the postal metaphor means that research pro- 
duction is not the ascetic, democratic process of 
subjecting the quality of writings to a fair game 
of dissemination; but, rather, it is an administra- 
tion of power and control over what makes it 
through the mail and what doesn’t. From the 
work of Kuhn (1970), Overington (1977), 
Gouldner (1979), Broad & Wade (1982) and


If I may wax wroth a little, what in heaven’s name is “critical” or “profound” about these issues? It is significant that they 
can be better expressed with portentous capitals. What is Science? What is True? The point is that such questions — the 
putative subject of methodology — are a waste of time. We can and do know what is true, or civil, without any inquiry into 
what is True, capital T, according to an approved Methodology, capital M. The project of demarcating Science from non-Science,
or Truth from truth, lacks point.


19 An important point here is the distancing from this “Other” and the absence of the voice of the other which cannot surface 
within the stifling discourse of the universities, themselves a metonym for the logocentrism of Western metaphysics. The play 
around this question of the excluded voice, a theme continued in Glas and other works, makes Derrida’s writings appear 
obscurantist and insane to some, but, to others, the seductive genre of these writings is an attempt to search for a language
which at least in a provisional sense circumvents the centering of orthodox Western discourse, a language which at least 
gestures toward an as yet unimagined centering of marginality (see Tyler, 1987, for an attempt to start such a postmodern 
conversation). Perhaps we can imagine a similar discourse of “accounting for the unaccountable”, not within the structure 
of accounting yet engulfing it. 
others we know that the production of academic
research is intensely policed and administered 
by hegemonic elites. In short, the rules of the 
academic poste are concerned with protection, 
discipline, and punishment. 

What this means for the production of knowl- 
edge is that, in Derrida’s terms, “the post is no 
longer a simple metaphor, and is even, as the site 
of all transferences and all correspondences, the 
‘proper’ possibility of every possible rhetoric” 
(1987, p. 65). Very simply, what counts as 
knowledge (or acceptable discourses) is always
filtered through a sociological network of endorsement
and stamping (taxing), and is perhaps
arrested through dead lettering. The process is
less one of efficient networking between sender 
and receiver than it is an act of insuring com- 
pliance with the authority, with the rules of ac- 
ceptable codes. 

In U.S. accounting, the “positive-empirical”
code has the largest postal network, with minor
postal services operating on a much smaller
scale. It is our belief that the poste of accounting
knowledge has become dangerously centered 
upon this technology of naive empiricism.
Further, the dominance of this view cannot be
explained in terms of the intellectual competence
of the arguments which it uses to privilege
itself, and the deconstructive exegesis of those 
arguments bears this out. Instead, this view sur- 
vives because of the political economy of ac- 
counting research. It has always been in the in- 
terest of those with the most wealth and power 
to make appeals to “the market” as the arbitrator 
of “quality,” and we suspect that this is the argument
that will be used against the more sinister
view of the poste presented in this paper. Along
with the Foucauldian critics, we would join in 
questioning the uncritical acceptance of “the 
market” for knowledge without investigating the 
extent to which the archaeological structure of 
that market can be described as egalitarian and 
fair. 

If the production of knowledge proceeds 
through postal rules which are biased, then serious
consequences may emerge for the nature of
knowledge generally. In fact, this is the key to 
Derrida’s battle against the constraints of Western
metaphysics. We argue that the combination
of accounting research being prescribed as a 
positive science and the absence of intellectual 
credibility for this view is sufficient evidence for 
concluding that “the market” for accounting 
research is governed by factors other than the 
quality of contributions to knowledge. 

For Derrida, the real tragedy is that claims to 
privilege lead other voices (other research per- 
spectives) into the dead letter box. It is this 
“dead lettering” that stops conversation, that 
censors, that keeps us always in the transmission 
of right answers. Ulmer (1981, p. 51) points out 
the need to overcome the desire of professors to 
conclude, to render a question inert through 
resolution, to reduce the tension of a problem or
an interpretation to the nirvana state of zero 
pressure by designing a decided meaning. 

We do, however, have an ethical responsibil- 
ity to the public because of the honorific posi- 
tion that we occupy as professors. Our unearned 
claim to privilege as “value-free” positive scientists
violates this responsibility. Just as I. A.
Richards pointed out that awareness of 
metaphor serves “to protect our natural skill 
from the interference of unnecessarily crude 
views about it” (1936, p. 116), Derrida points 
out that the privilege of academic discourse “is 
as if a catastrophe had perverted this truth of 
nature: a writing made to manifest, serve and 
preserve knowledge — for custody of meaning, 
the repository of learning, and the laying out of 
the archive — encrypts itself, becoming secret 
and reserved, diverted from common usage, 
esoteric ... a writing becomes the instrument of
an abusive power, of a caste of ‘intellectuals’ that 
is thus ensuring hegemony, whether its own or 
that of special interests ...” (1979, p. 24). As 
Ulmer (1981, p. 53) explains, “The catastrophe 
is the second veiling which covered over the 
first, which made the secret secret, causing 
people to forget the original encrypting and 
accept the power of the priests as natural.” One 
can espouse “positive theory,” “value-freeness”
and “scientificity” only behind the first veiling of
the myth of metaphysical presence, the myth
that one is writing about some-THING beyond
one’s writing. The second veiling is the forgetting
of the myth which is what gives scientism in
accounting such author[ity] to speak of “truth.”
Deconstruction, if it were to decenter the objectivist
foundationalism in accounting, would
be invaluable as an act of liberation. But if we are
ultimately unable to escape from logocentrism
and grounding, as Derrida has suggested, then
[con]texts will always exist for deconstruction.


However, deconstruction is about language and 
offers much more than a strategy for the critical 
reading, re-writing and decentering of account- 
ing research texts and the opening up of 
academic discourse. Accounting itself, account- 
ing practice, is a way of writing a certain kind of 
economic text about organizations, about 
“organizations” of economic meaning (see also 
Hoskin & Macve, 1986). The meaning circumscribed
within these accounting texts is presented
as facticity and as transcendentally signified
beyond the text itself. The metaphor “the
bottom line” speaks to the grounding of account- 
ing in a metaphysics of presence and facticity, 
and usage of the metaphor beyond its account- 
ing context reveals just how firm the grounding 
is. Accounting, then, produces an economic cen- 
tering and privileges a particular type of econ- 
omic visibility and calculation in the writing of 
texts about the organization (Burchell et al., 
1980). 

The powers of accounting to discipline, to 
punish, to legitimate, are themes that have 
emerged in recent critical discourses about 
accounting. Its power exists, at least in part, 
because accounting writes itself as transcenden- 
tally signified, as originating in “nature.” That 
presence veils the constructivist origins of ac- 
counting and self-legitimates the legitimacy of 
accounting in a grounding of facticity (Lyotard,
1984, 1985). To deconstruct these texts, then, is
to dislocate accounting’s arbitrary power to cir- 
cumscribe and close off the further possibilities 
of economic meaning about the organization. A 
deconstructive reading of accounting itself is
beyond our purpose in this paper. But two of the
Derridean themes one might explore would be
the production of meaning through binary oppositions
and the play of differences [e.g. debit/credit
(see Hoskin & Macve, 1986, p. 119), asset/liability,
revenue/expense], and the marginalization
and exclusion of what is not present, the accounting
and the not accounting.

Finally, deconstruction offers a pedagogy for 
the critical reading of texts. Modernist teaching 
of classic research papers follows Norris’ (1983, 
pp. 555-556) description: 


It tends to be assumed, once a text achieves a canonical 
status, that the business of commentary is to seek out
coherence and intelligibility, to justify the text on its own 
argumentative terms. 


Deconstruction, of course, does exactly the op- 
posite and calls into question the “coherence 
and intelligibility” of a text precisely on its own 
argumentative terms. Our own classroom 
experiences with deconstruction have been
encouraging. Students who have spent years of 
“learning” the authority of logocentric truths in 
texts appreciate the dissemination, subversion 
and play of meaning opened up by deconstruc- 
tive readings. The best hope for decentering 
modernism lies with the novice and uninitiated 
scholars — our graduate students. Modernists 
are not likely to embrace the indeterminacy of 
postmodern thought (see Kuhn, 1970, pp. 150-152,
for an elaboration in his framework of normal/extraordinary
science). From this perspective,
teaching is critical and deconstruction is a
potent reading strategy to facilitate the pedagog- 
ical move from modernism to the postmodern 
moment. 


CONCLUSION 


This paper has tried to do several things. First, 
it introduced the writings of Derrida and used 
deconstruction as a means to reflect upon the 
claims to privilege by positive theorists. The 
point here is not destruction, not the dumb- 
witted response of those who would shout ir- 
rationalism or irresponsibility. Nor is the point 
to replace positive theory with another 
privileged one. Rather, deconstruction wishes to 
operate on the pretexts of texts in order to dis- 
mantle their pretensions. Traditional critiques 
are based on attacks from “outside,” from a different
view of Truth. These critiques are Blitzkriegs,
bombs dropped to explode one clerisy
and replace it with another. Derrida’s critiques 
work from the inside, they bore from within, 
they implode claims to privilege. 

Second, this paper sought to point out that 
positive theory’s claim to privilege is based on 
the worst kind of sophistry — the kind that 
denies its own rhetoric. Jensen’s claims have no 
validity when considered in light of even the 
most cursory understanding of the nature of
science. Even at that, Jensen’s text falls far short
of the mark of his own representation of science.
The real danger of positive theory is that we ap- 
pear to have taken it seriously as a superior way 
of knowing. It is entirely possible that the kind of 
libertarian economics Jensen has in mind is find- 
ing its way into public policy by persuading 
others that it is based on objective, value-free 
science.²⁰

Third, the paper seeks a certain kind of humility
in scholarship, a humility which causes us to
recoil at even the suggestion that we may be in 
the best position to tell people what is in their 
best interest. The quality of human life is much 
too rich and the nature of our relationships with
each other much too complex for any system of
thought to dictate from a position of privilege. In 
fact, the Foucauldian scholarship surfacing in 
accounting is designed to illustrate that such 
complexity makes impossible any claim to speak 
descriptively about human relationships in such 
a totalizing sense. 

Most importantly, the paper wants to expand 
the conversational space in accounting research 
by removing the privilege attached to certain 
kinds of writing technologies. For example, 
questions of ethics, fairness and materiality are 
central to accounting as is the question of the re- 
lationship between life and labor. These ques- 
tions do not wash very well through modernistic 
privileging of Method.²¹ They are best addressed
through the genre of essays. No amount of data 
or evidence can dictate the conditions of, say, 
fairness to humans. We must become the critical,
self-reflective, humble peddlers of questions
rather than answers. Deconstruction is much 
more interested in the questions, in opening up 
rather than closing off discourse. We might even 
attend the post office, observe the dead letter 
box, and release all kinds of post cards which are 
both undelivered and capable of contributing
much.
BUDGET IN A COLD CLIMATE*


BARBARA CZARNIAWSKA-JOERGES and BENGT JACOBSSON 
Economic Research Institute, Stockholm School of Economics


Abstract 


This paper attempts to trace connections between budget processes taking place in concrete organizations 
and the cultural context in which the organizations are located. The examples are taken mainly from studies
of the Swedish public sector. The research perspective adopted depicts budgeting as a symbolic perform- 
ance rather than a decision-making process; a means of conversation rather than a means of control; and an 
expression of values rather than an instrument for action. From this point of view, linking budgeting to a 
cultural context means looking at which symbols, what language and which values are represented in particular
budget processes. Budget processes are seen as a ritual of reason, reflecting the high value which is
attached to rationality in Sweden in general and in the public sector in particular. We also claim that budgeting,
a language of numbers, is also a language of consensus, which permits the handling of potential conflicts
without confrontation. Finally, we analyze the recurrent changes of dominant budget forms as being
congruent with the culture of “reformism” which seems to typify the area in which we are interested. 


Budgeting is often looked upon as a process
where rational choice takes place, and budgets 
are seen as instruments for planning, coordination
and control (Anthony et al., 1972). The
machine metaphor comes to mind; given certain
rules and data, good budgeting ensures that
the best decision will emerge from the process.
Other metaphors have also been used, however.
Wildavsky (1974), for instance, uses metaphors
of gaming and combat, indicating the political 
character of budgeting. 

Budgeting is perceived as important for organizational
outcomes. A popular textbook
used in Swedish universities is titled, somewhat
pretentiously, How Is Sweden Ruled? Central
Administration and the State Budget. Half of the 
book is dedicated to descriptions of budgetary 
processes in Swedish state agencies and ministries.
This is a good illustration of a widely held
belief according to which the budget process is 
a very important (sometimes even the most
important) political process. The budget is
considered a significant instrument of resource
allocation as well as of control in the public sector
in Sweden. However, it is difficult to maintain
this dominant view when confronted with the 
real world. 

Some apparent paradoxes have already been
pointed out in several empirical studies to which 
we shall return. Although budgets are consi- 
dered to be important control instruments, it 
seems to be extremely difficult to exercise con- 
trol through budgets. Although budgetary pro- 
cesses are perceived as processes through which 
important decisions are made, decisions often 
are initiated and pondered over in other arenas. 
Although budgeting represents effectiveness 
and efficiency, the relation between budgeting 
and efficiency is highly ambiguous. We do not 
claim that budgeting has nothing to do with 
rationalistic decision-making or struggles for
political power, but it is very difficult to inter- 
pret budgetary behaviour solely with these 
frames or metaphors in mind. 

The instrumentality of budgeting has been 
challenged and more symbolic aspects have 


*The first version of this paper was presented at the Workshop on Accounting and Culture organized by the European Insti- 
tute for Advanced Studies in Management, Brussels, 9-11 December 1987. 
been stressed. This has had limited effects on actual
budgetary processes or on budgetary reform
proposals, however. The general reaction
towards the paradoxes has been one of enhanced
idealism: maybe the world is not as perfect
as our models ... yet! And proposals for improving
the instrumental functions of budgeting
continue to emerge.

Our stance in this paper is not one of idealism, 
but that of realism. We would like to know why 
actors stubbornly stick to the instrumental perspective
on budgeting, and we are convinced
that the answers have something to tell us about
the cultural context in which these processes 
are embedded. 

We depart from a very commonsensical pre- 
mise: that every social process, and therefore 
every organizational process, is embedded in 
what can be called a cultural context of organiz- 
ing (Czarniawska, 1986): a network of wider 
processes and more stable systems of values and 
beliefs, themselves shaped by history, geopolitics,
economic situation, art.¹ In this sense,
budgeting as an organizational process is both 
shaped by the context in which it takes place and 
contributes to the re-creation of this context by 
reproducing its main values and customs. 

Hence, our focus is primarily on symbolic per- 
formances of budgets, not their part in decision- 
making processes. We look at budgeting as a way 
to communicate rather than a way to control. 
We focus on budgeting as an expression of values,
not as a means of achieving coordinated
action. We attempt to link budgeting in organiza- 
tions to the cultural context. How do the organi- 
zations present themselves to their own mem- 
bers and to the environment? Which symbols do 
they transmit? What language do they use; and 
which values are represented in particular 
budget reforms and procedures?

While analyzing these issues, we traced, separately
or at the same time, at least three relevant
aspects of the cultural context in which budgetary
processes take place: the public administration
context (as compared, for example, to a
business context), the Scandinavian (or 
Swedish) cultural context, and Western 
rationalism, the most general of the three. 

The pattern that will come out of the analysis 
is not an unfamiliar one. Budgeting is seen as a 
ritual of reason; budgets are presented accord- 
ing to and conforming with prevailing norms of 
rationality. Budgeting is also a language of con- 
sensus; there are several mechanisms in budget- 
ary processes for reducing the level and amount 
of conflict. Budgeting is a way of emphasizing a 
society that is gradually changing for the better; 
budgetary reforms are supplied and carried out 
in the administrative agencies with an interval of 
8-10 years. We interpret budgeting as a way of
expressing and enforcing some dominant values 
in the typical context of Western contemporary 
organizations: rationality, consensus and pro- 
gress. 


BUDGETING AS A RITUAL OF REASON 


Budgeting may be better understood if we do 
not take for granted the assumption that budget- 
ing has something to do with the allocation of 
scarce resources. In a study of a Norwegian 
municipality, Olsen (1970) analyzed the budget 
process as a ritual or ceremony. The significance 
of the process was not that values and demands 
were transformed into activities, as the instru- 
mental perspective would predict, but that the 
process itself strengthened certain values and 
ideas: 


Since we so easily accept ... that budgeting has some- 
thing to do with decision making, future directed action, 
coordination, goal-directed activity, etc., budgeting will 
carry strong meanings in our culture. The word will have 
a positive loaded, expressive potential, and it should not 


1 We perceive the fruitfulness of such a fuzzy-edged concept as “cultural context” as greater than that of “national culture”.
We all use terms like “Swedish culture” habitually, but it would be a bad habit to take such a concept literally in research. We 
do not know where Swedish culture ends and Finnish culture begins. All we can do is to recognize traces of relationships that 
seem to be stable within some cultural context (which is usually, most easily identified by a nation-state, but not always so), 
and then carefully follow these relationships to see where they go and where they vanish. 
surprise us if an activity called budgeting produces a certain
level of acceptance, legitimacy and reverence
(Olsen, 1970, p. 87).


Olsen’s diagnosis emphasized that budgeting
can therefore be seen as a ritualistic act venerat- 
ing reason. 

A study of the political control of administrative
organizations in six policy sectors in Sweden
also examined the role of budgets (Jacobsson,
1984). Important changes in these sectors
were analyzed by tracing back who had taken 
the initiatives, which arenas were used, which 
actions were undertaken, etc. Despite the fact 
that when actors were questioned about the 
arenas which they considered to be important, 
they persistently stressed the budget process, 
the analysis of the actual changes provided quite 
a different picture.

The budget process was not the arena where 
important changes were considered. Wil- 
davsky’s thesis that “the budget lies at the heart 
of the political process” did not receive any sup- 
port (Wildavsky, 1974). Major alterations were 
invariably considered outside the budget. The 
expansion of resources for the National Board of 
Occupational Safety and Health was, for ex- 
ample, a result of considerations both in the 
Work Security Commission and in a work group 
in which the Swedish Confederation of Trade 
Unions (LO), the Social Democratic Party and 
the Ministry of Health and Social Affairs participated.
These organizations had no connection
with discussions in the budgetary process.
Another example: the Industrial Board failed to 
obtain money and personnel due to the fact that 
the appropriate Ministry chose to carry out their 
investigations independently, thus bypassing the 
Board. This affected monetary allocations, but 
the decision was not made within the budgetary 
process. Decisions were registered in the budget 
documents as one aspect of their function as all- 
encompassing contracts, but this usually only 
meant that decisions made elsewhere were con- 
firmed. Rather than being an important 
battlefield, the budgetary process reflected 
changes discussed elsewhere. 

Central agencies also enjoyed a considerable
freedom of action concerning both the volume
and the substance of their activities. Generally
speaking, Ministries did not exercise control
through budgets (which does not mean that
they did not apply control by other means).
Agencies acted within rather wide frameworks
for conduct. To a large extent, they were also
able to secure money from alternative sources,
above all by using some of the grants-in-aid
which were administered by agencies, for their 
own purposes. The National Board of Occupa- 
tional Safety and Health, for example, decided 
how to allocate new personnel inside the 
agency, neglecting instructions from the Minis- 
try. 

Even when Ministries tried to realize their 
control ambitions through budget documents, 
they did not succeed. In spite of the Ministry of 
Labour’s refusal to establish an information unit 
at the Immigration Board, the agency de facto 
created such a unit. Alternative sources of 
money and a generally expansive climate per- 
mitted the agencies to act on their own 
(Jacobsson, 1984). 

Nevertheless, large efforts were devoted to 
writing budget proposals and supporting docu- 
ments. Budgeting was considered to be ex- 
tremely important, even among those who 
realized that important changes were discussed 
and decided in other arenas. One explanation is 
that such people thought that the budget pro- 
cess should be more important than it was: they 
were guided by a normative vision. Applying the 
cultural perspective, one may ask further ques- 
tions. Budgeting may not only be important as an 
arena where values and resources are trans- 
formed into activities, but also as an arena where 
important values are expressed. 

The production of information in the budget- 
ary process was also a way of demonstrating — 
to oneself and to others — that the activities 
were carried out in a proper way. The existence 
of calculations in accordance with rational 
criteria shows in itself that intelligent decisions 
are being made. This is in line with a common 
observation that organizations collect informa- 
tion which is not used, that reports which are de- 
manded are not read and that information is 
asked for after actions have been carried out
(Feldman & March, 1981). This was certainly
true for the budget processes. Although those 
who made the decisions about appropriations 
lacked both time and knowledge to use the infor- 
mation that they demanded, they continued to 
demand more of the same. 

The Ministry of Finance, in collaboration with
the National Audit Bureau, had issued a manual
for those who had to prepare budgets for the
agencies. The instructions in the manual were
very rationalistic and stressed that budget documents
had to include long-range forecasts, goals,
alternative ways of achieving goals, the costs and 
effects of different alternatives, rationalization 
alternatives (on different levels), etc. The in- 
structions were clearly consistent with argu- 
ments in business textbooks on budgeting, but 
they had nothing to do with the information that 
the Ministry of Finance and other Ministries in 
fact thought could be used in the budget pro- 
cess. 

Since the Ministry of Finance demanded infor- 
mation that was never used, a great deal of seem- 
ingly futile work with different kinds of budget 
documents was carried out inside the state agen- 
cies (Jacobsson, 1980). At the very least, it was 
difficult to understand the activities from an in- 
strumental point of view. We come back to 
Olsen’s observations of budgeting as a process 
which carries strong meanings in the public sec- 
tor culture. It is reasonable to start with goals, to 
consider alternatives, to discuss the costs and 
the revenues of different alternatives. Budget re- 
ports then become instruments for reflecting 
these dominant, rationalistic ideas. 

Budgeting with its aura of rational goal- 
oriented action can be viewed as a formal proce- 
dure or a facade that public administration or- 
ganizations employ in order to preserve or in- 
crease their legitimacy. What really goes on in 
the budgetary processes differs substantially 
from the ideas reflected in budget instructions
and documents, but as budgeting is considered
to be the rational way of exercising control,
especially in times of stagnation, efforts are made 
to strengthen its role. Producing budgets each 
year is a way of signaling that organizations act 
rationally. 

Even if much of the information compiled is 
not used, organizations are assessed on the basis
of their ability to live up to the rationalistic
ideals. This is more often the case in public organizations,
where the results of the activities
are often ambiguous and difficult to measure. 
The National Audit Bureau, for example, 
examines how the boards succeed in following 
the instructions, and a lot of prestige is at stake in 
these assessments. The agencies have also been 
encouraged to improve their economic and ad- 
ministrative capabilities, which means that they 
hire persons who have the skills required for 
producing budgets in the way that budgets shall 
be produced. 

To conclude: budget processes very seldom
seem to be significant arenas where policy
changes, whether major or not so major, are
outlined. Nevertheless, budget documents are
demanded, compiled and presented in order to
give the impression that goal-directed and
rationalistic actions take place. We perceive
this as ritualistic behaviour; budgets are
rituals of reason.

There is no doubt about the fact that reason is 
one of the main values of the contemporary 
Western culture. However, the rituals differ. 
Rituals fulfill their role when they communicate 
and confirm an important value without much 
disturbance to actual activities: on the contrary, 
their functional utility is very clear (Leach, 
1968). When they become a real nuisance, they 
might be discarded; other, more comfortable, 
rituals can be used instead. Companies, for
example, are too impatient to devote so much
emphasis to budgets and are unwilling to waste
time on budgeting.† Accounting is their ritual of
reason, and Annual Reports their icons. Public
administration, with its political slant, tends to
lean towards the future. Business organizations
see the reason best when they count their
money. Different worlds, different customs.

† In a study of American retail companies (Czarniawska, 1985a), even where budgeting was a common practice, General Managers emphasized its secondary role: it was much more important to achieve one’s result than to hold to one’s budget. The same phenomenon was observed in Swedish companies (Czarniawska-Joerges, 1988).


BUDGETING AS A LANGUAGE OF CONSENSUS 


There seems to be a far-reaching agreement 
among both Swedish and non-Swedish authors, 
concerning one specific element of the Swedish 
culture. It is commonly presented as a conflict- 
avoiding, consensus-seeking culture (Hofstede, 
1980; Daun, 1984; Czarniawska & Wolff, 1986; 
Hirdman, 1987). However, we would like to
placate those who would, rightly, protest at
this bucolic description of our cold country by
stressing that it must not be read as a picture of
a conflict-free culture. To be more precise,
conflicts are not considered to be enjoyable;
confrontations are avoided; consensus is seen as a
desirable outcome, even if it is understood as
“constrained conflict” (Gustafsson, 1986).

Specifying the argument concerning confron- 
tation avoidance, Daun (1984) and Czarniawska 
& Wolff (1986) suggest that Swedish culture, as 
compared to, for instance, French, views the 
spoken word as “heavyweight”. Spoken, and for 
that matter written, words can easily lead to an
open conflict if not properly thought through.
The best everyday tactic is to limit verbal
utterances to a necessary minimum, and then
carefully estimate the weight of their contents and
form.

Against this background, the language of num- 
bers can be seen as an attractive alternative to 
the verbal language. It serves the purpose of or- 
ganizational communication and diminishes the 
risk of unintended conflict as a result of impre- 
cise verbalization (see also Ashton & Bizzell 
1975; Mellemvik et al., 1987). Numbers do not 
carry emotions; if anything, they help to hide 
them. There is no danger that a stray number will 
appear among orderly rows and columns; after 
all, they have to add up. It is the language of rea- 
son and relevance. Additionally, if we take 
budgets as utterances, they are not as binding as 
verbal promises are. Budgets are, by definition, 
proposals and it is generally known that reality 
tends to interfere with their fulfilment. Verbal
utterances can also be proposals, but then a great
many interpretations can be offered, increasing 
the possibility of conflicts. 

Therefore one could expect budgets to be 
used as a means of communication in potential 
or actual areas of conflict. We found three such 
areas. 

Brunsson & Rombach (1982) analyzed 
budget processes in Swedish municipalities 
under stagnation. Local governments in Sweden
are accustomed to acting in conditions of general
affluence, continuous expansion and steady
growth. Due both to the state budget deficit and
a changed political situation (a change after
almost 40 years of Social Democratic
government), the growth has stopped. One could argue
that expansion has not, and that municipalities
are now richer than ever (Brunsson, 1986), but
the situation has changed from one
characterized by a 4-9% annual expenditure increase
to a situation of almost zero increase. If this is not 
an austerity situation, there is certainly an 
austerity mentality. 

This perception produced a new actor in the 
budget drama: a hamster, who joined the
traditional two old-stagers, the guardians and the
advocates. What was previously a conflict-prone
situation (an adversarial relationship between
guardians and advocates) acquired a new conflict
potential. But hamsters are, just as guardians have
more or less successfully always tried to be, those
who speak numbers only. If guardians and
advocates presumably had to mention values and
goals to support their positions, hamsters have
no such problem. Their concern is not the el- 
derly or the state of the roads, but the sound 
financial position of the municipality. This is a 
value that requires no other argument but num- 
bers. Therefore budget processes gave the 
hamsters the upper hand. They have dominated 
the discussion. The smooth language of numbers 
has won. 

Numbers in budget documents serve not only 
to fight bloodless wars but also to hide potential 
battlefields. Mechanisms of incrementalism are 
founded on the idea that only a few conflicts can 
be handled each year in the budget processes. 
Most activities are not questioned; they
continue as before.


Numbers may also play an important role in a
political debate. Even if it is difficult to express
differences of opinion regarding service quality
and other outputs (due to the ambiguities
involved as well as to the tactics of not saying too
much) it may still be possible to express a
different standpoint by saying that more (or less)
money should be given to a given field of
operations. There are quarrels about input figures,
which may or may not have something to do
with actual activities. It seems as if political
debates in Swedish municipalities take place in the
budget arena more often than is the case in state 
administration. The tolerance for conflict is 
higher; there are fewer observant and suspicious 
neighbours. Nevertheless, even issues concern- 
ing bus time-tables and hospital queues are dis- 
guised as numbers. 

As already mentioned, state agencies have to 
be careful about appearances. Conflicts must not 
be seen and precautions must be taken. The time 
when agencies send their budget proposals to 
their ministries, and then to the Ministry of Fi- 
nance, is fraught with potential conflict. When 
state budgeting was studied (Czarniawska, 
1985b), the austerity mentality prevailed and 
most agencies were supposed to cut their 
budgets back by 2% per year. 

Table 1 presents the budget proposals of those
agencies which could be compared with others
(public utility companies are not included).

In fact, 22 of the 34 agencies proposed an
expanded and not a cutback budget (and among
the remaining 12, only 6 obeyed the
instruction), with the proposed expansion in one
case being as high as 20%. This fact in itself suggests
a special role that a budget proposal fulfills. In- 
stead of starting a debate with the Minister of Fi- 
nance (which would be perceived as illegiti- 
mate, risky and inevitably conflict-building) the 
agencies managed to convey their opinions by 
the means of budget proposals. The point was 
made and no offence was taken! Verbal argu- 
ments, in the form of an introductory letter to 
the proposal, were not elaborate and hardly orig- 
inal (see Tarschys & Eduards, 1975, for budget 
arguments): the threat of impoverished quality 
and a growing demand for the agency’s ac- 
tivities. 

In yet another study of municipalities (Czar- 
niawska-Joerges, 1987) the “budget techni- 
cians” — the officers responsible for budget 
techniques — looked for new, more compli- 
cated procedures in order to make the language 
hermetic to politicians. Overwhelmed by tech- 
nical complexities, the politicians frequently 
had to give up an attempt to discuss the budget. 
Their imprecise, value-laden arguments were 
formulated in a verbal form which officers
refused to recognize. They offered the language of
numbers in its place, and its grammar proved to
be impenetrable for the politicians. In this way
one could avoid both conflict and undesirable
changes.*


The table remains a purely illustrative device, as some of the calculations were done by us, some by agencies themselves, and the basis does not have to be the same. Indeed, some agencies calculated their cutbacks from non-appropriated budgets, which is quite a heroic step.


* To contrast this use of budget (as a polite and/or hermetic language which allows differences and avoids confrontations) we can quote an example of the use of plans in Polish state-owned enterprises. Annual plans contain both budget proposals (investments and expenditures) and sales estimates. They are sent for approval to higher (political and economic) authorities. As a rule, the first part (the budget) is then cut down and the second (the financial results) is increased. As a rule, the next annual plan follows the same procedure. Partly it is a matter of a lack of trust: the enterprises assume that their budgets will be cut and the authorities assume that enterprises exaggerate. But if everybody knows that, this tiresome and irritating (for both sides) ritual could be eliminated. However, it is more than a ritual. The enterprises do not care much about cuts. But when it is time to balance the books, they will triumphantly show that the reality was closer to their estimate and not to the authority's! In a country where talk is very light-weight, numbers are heavy arguments, and the numbers will be used in a fierce fight over the next annual plan. Conflicts are welcome: they show the employees that their managers do care about the organization and are prepared to fight against the authorities whenever necessary! Numbers become bullets and the budget is a bomb with a delayed fuse.


TABLE 1. Budget proposals

Agency   Accepted      Appropriated   Cut-back      Budget
no       budget        budget         budget        proposal
         1984/85       1984/85        1985/86       1985/86
         (100SKr)      (% change)     (% change)    (% change)

1         4,615,436     +8.3           -2.0          +2.5
2            64,361     +4.2           -1.0          -0.9
3         5,677,738     +7.0           -2.0          +0.8
4            12,924     +6.2           -2.0          +4.4
5        20,072,625     +5.0           -2.0          +2.0
6           663,800     +5.0           -2.0          -3.0
7            54,915     +6.3           -3.4          +1.9
8           100,776     +6.0           -1.0           0.0
9            95,374     +4.9           -1.0          +2.0
10           21,427     +9.2           -2.0          -1.6
11          194,581     +5.9           -2.0          -2.0
12           52,127     +4.2           -2.0          -1.8
13           66,272     +4.0            0.0          +3.7
14          164,475    +11.6           -2.0         +20.0
15          354,370     +6.2           -1.4          +2.7
16           39,014     +8.3            0.0          +3.0
17           31,861     +5.5           -1.5          +0.5
18           34,948     +6.0           -1.0         +16.3
19           57,057     +6.8           -2.0          +3.0
20          688,057     +6.5           -2.0          +1.5
21           14,047     +6.2           -2.0          +4.2
22           36,791     +5.4           -2.0          -2.0
23        6,933,000     +8.8           -2.0         +12.7
24           56,334     +4.8           -2.0          -2.0
25           58,832     +4.5           -2.0          +0.1
26           81,368     +8.6           -2.0          -2.8
27           30,829     +1.4           -1.8          +0.5
28          139,904     +5.7           -2.0          +0.6
29           26,133     +6.3           -1.6          -1.6
30        1,091,947     +6.0           -2.0          +1.1
31           15,722     +5.7           -2.0          +2.2
32           38,477     +5.0           -2.5          -2.5
33           47,140     +6.0           +2.0          +3.8
34          344,404     +7.8           -1.5          +1.4





BUDGETING IN THE LAND OF REFORMS 


Apparently, budgets are good things to re- 
form. Over the years, we have witnessed several 
attempts at changing the activities that are
carried out in the budget processes in the state
administration. It seems as if a budgetary reform
regularly recurs at intervals of 8-10 years, and,
much to our surprise, it seems to be the same
reform that pops up every time. Rationalistic and
goal-oriented ideas have been dusted off and
proposed in a slightly different form over and
over again. We will briefly describe three such
reforms: program budgeting in the 1960s; the 
continuation of the development of the SEA-sys- 
tem in the 1970s; and the three-year budget in 
the 1980s. 

In the late 1960s, program budgeting was 
proposed as the reform that would make
administration effective as well as efficient. The
Swedish version of program budgeting was
focused on the administrative level. The Report
on Program Budgeting was prepared by a state
agency, not by a public commission or by a
ministry. Key concepts in the proposal were the
specification of goals, the specification of alter- 
native programs and the measurement of costs, 
performance and the effects of different pro- 
grams. Productivity and effectiveness (mea- 
sured as a relationship between costs and goal- 
fulfilment) were stressed and it was very clear 
that these ideas originated in the private sector. 

Several agencies became experimental sites 
where the new budget techniques were to be 
tested. In all other agencies, goal defining and 
measurements of costs, performance and effects 
were introduced. At this stage, the introduction 
of program budgeting did not meet with any ob- 
jections in society as a whole; in fact it was 
hardly ever discussed in public (Andrén, 1975). 
Government and Parliament, however, were not
taking part in these experiments. Later on, it was
proposed that all political and administrative
levels should work in accordance with the
proposed ideas, but by then the enthusiasm for 
program budgeting was evaporating. 

Andrén comments (with implications that we
will return to later) that “the reform has been
made in line with the predominating mood of
the Swedish society since World War II ... Its
main watchwords have been ‘rationality’,
‘efficiency’, ‘progress’ and the like” (1976, p. 343).
And it was clear that this reform was inspired by
business organizations (Amnå, 1981). The
committee’s experts were specialists in business
administration. In 1973, when the proposal
reintroducing program budgeting throughout the
entire political system appeared, its advocates in
the Budget Commission were economists. And
this time, the proposal, titled Budgetreform,
was turned down. Program budgeting
techniques were not popular anymore, although the
Government clearly declared that “program
budgeting ideas are here to stay”. The ideas,
however, had to be presented somewhat
differently.

What happened then with the experiments of 
the 1960s? Despite the fact that the reform was 
initially presented as a considerable success (An- 
drén, 1976), some serious problems did occur. 
As the later evaluations showed, it was difficult 
to live up to the ambitions. Measuring, or even 
describing, outputs and impacts turned out to be
very difficult for most of the agencies. Budget
dialogues had not changed significantly, and it
proved extremely difficult to couple costs, on
the one hand, and outputs or impacts, on the 
other hand (Riksrevisionsverket, 1974; Ar- 
vidsson & Osius, 1977). Hence, everybody still 
loved the general idea of rational goal-directed 
behaviour, but the ideals did not come true... 
this time. 

The next reform, in the 1970s, was called a 
“modernization of the budgeting system” (Ar- 
vidsson, 1977), and it was also carried out on the 
agency level. This time it was the National Audit 
Bureau that took up its cudgels to fight for the 
ideas, as a development of the State Economic- 
Administrative System (SEA). The SEA system 
was “a package of planning and evaluation 
methods as well as information systems to be 
used by agencies .. . In essence it represents the 
practical application of the program budgeting 
concepts within the government sector” (Ar- 
vidsson, 1977, p.2). Considerable effort was 
spent on developing agency planning and 
evaluation systems and the reform was pre- 
sented as a continuation of the old one. 

The National Audit Bureau organized confer- 
ences devoted to information about the efforts 
in administrative development and administra- 
tive effectiveness. And they evaluated these ac- 
tivities in order to find out how many of the 
agencies really did try to apply the ideas. The 
National Audit Bureau helped the Ministry of 
Finance in developing new budget requests. 
With the introduction of budget requests
(which were also based on ideas about planning,
goals, comparison of alternatives, etc.) they
also checked whether the agencies
followed these instructions.

Considerable effort was put into improving 
the financial and administrative capabilities of 
state agencies. Arvidsson says: “The develop- 
ment of modern economic-administrative con- 
cepts and methods began in the early 1960s. 
There has been a steady progress. The ways of 
approaching problems have been pragmatic” 
(1977, p.4). No drastic changes, no revolutions, 
no conflicts, but “a commitment to the need and 
desirability of improving management in the
public sector”. It is appropriate at this juncture 
to return to Andrén, who noted that the image of 
success for the program budgeting system in 
Sweden “may well be explained by the fact that 
the final moment of truth has not yet been 
reached” (1976, p.357). 

On the other hand, in a culture dedicated to 
continuing reforms and steady progress, final 
moments never really come. Reforms are
continuous processes, with neither beginnings nor
ends. And the “rational and goal-directed
behaviour” solution has, over the years, been
connected with different kinds of problems. In the
1960s, the problem was inefficiency. In the 
1970s, the problem was stagnation. And, as we 
shall see, in the 1980s, the problem was short- 
sightedness. A new reform called “three-year 
budgeting” was launched in the middle of the 
1980s, after a proposal from the Verksled- 
ningskommittén. 

The idea was that agencies should be given an
opportunity to develop their professional
competence in achieving the goals that had been
decided by Parliament and Government.
Multi-year budget frames were to be used: agencies
were to be given more freedom in the allocation 
of their resources both in time and over time. As- 
sessments of efficiency were seen as being very 
important, and they had to be explicitly ac- 
counted for in budget documents. Every three 
years, agencies were to present a report evaluat- 
ing the activities of the last three years, including 
proposals for the next few years. 

The ideas live on. Goals are viewed as control 
instruments. Details must not be controlled. 
Efforts ought to be put into the analysis of the re- 
lationships between costs and effects. Measure- 
ments of productivity and efficiency are to be 
developed. Planning must be more long-sighted, 
through the use of multi-year budget frame- 
works. Reformation continues, this time with 
more emphasis on ex post evaluations of on- 
going activities, which is a slightly different em- 
phasis compared with that of the 1960s, and the 
early 1970s when the focus was on ex ante plan- 
ning. Everything takes place inside the budget- 
ary process. 


Therefore, budgets have been amiable objects 
for continuous reforms, and budgetary reforms 
make it possible to achieve a balance between 
the past and the future in a way that makes it im- 
possible to speak about either starting points or 
dead ends. Reforms represent continuity 
through change, seemingly paradoxical but al- 
together very attractive values in Swedish or- 
ganizational life (Jacobsson & Sahlin-Anderson, 
1985). What Andrén said about program budget- 
ing in the 1970s holds true for the present (and 
possibly a next) reform: “Ambitions... remain 
high and the bodies directly responsible for its 
development have been pleased with its pro- 
gress” (p.357). 

Reform is an ideology carrier which makes it 
possible to claim that changes are at hand, with- 
out experiencing the difficulty of criticizing 
what has been. The past was not bad at all, but 
the future will be even better, and reorientations 
are continued. These are the values stressed in 
the budget reform proposals. Other values, em- 
phasizing conflicts and differences in resources 
between different actors that may be even grea- 
ter with the reforms, are not stressed at all. 
Budget reforms refresh rituals of reason, budget 
reforms up-date the language of consensus; and 
budget reforms signal continuity through 
change. 

This gentle, continuous reformation is possi- 
ble in a stable political climate with a recent his- 
tory of economic success. Reform is necessary in 
order to demonstrate progress and the aware- 
ness of the parties involved, but it does not re- 
quire public regrets and public accusations, a 
cutting off of the past or scapegoating — in short, 
all the symptoms typical of, for example, reforms 
of socialist economies. In a socialist economy, 
the reformation is clearly of a religious charac- 
ter: a general purification must be achieved 
before the first steps can be taken along the New 
Way (usually quite old). In Sweden, a cool 
breeze of reason encircles the change, which 
consists of a confirmation of how good the old 
ways really were. 


BUDGET IN A COLD CLIMATE


Models of budgeting seem to emphasize the
instrumental view of budgeting, while real
world budgeting gives a more complicated pic- 
ture. Practitioners from the public sector tend to 
be very amused when researchers point out the 
more symbolic and cultural aspects of budget- 
ing. It often seems that they do not feel uncom- 
fortable or embarrassed as they listen to our in- 
terpretations. Nor are they annoyed by observa- 
tions that make a stir in the research community. 
And while they can accept “rain-dances” and 
“battles” as metaphors for real world budgeting, 
they still think that budget reforms should be
designed in a proper, i.e. rationalistic, way.

One reason for this attitude is that practition- 
ers do not confuse the “is” and “ought” of 
budgeting. We — and here we speak of ourselves 
as astonished researchers — often tend to think 
that the existent (“is”) is something that should 
be transformed into the desirable (“ought”), and 
that something is clearly wrong when this does 
not happen. A common way of approaching the 
phenomenon is to look upon it as a problem of 
implementation. Explanations of why effects and 
impacts often do not live up to intentions can be 
different: lack of bureaucratic competence, con- 
flicts between different groups, ambiguous in- 
tentions, etc. This notion, however, of ideals as 
something that should be transformed into real- 
ity may be a misleading one. 

Sartori, in his treatise on democratic theory, 
offers another view (Sartori, 1962). An “ought”, 
according to Sartori, is not meant to replace “is”, 
but “... it is meant to be a counterweight, which
is a completely different matter. The ought is
always excessive, it smacks by hybris. The reason
for this is that ideals are born from our
dissatisfaction with reality and have a polemic function,
a countervailing role” (Sartori, 1962, p.64). 
Therefore, ideals are unreal and so they should 
be. When attempts to enforce ideals are not very 
successful, this should not be interpreted as 
being due to deficiencies in the ideas. Following 
Sartori, “ . . . ideals are not made to be converted 
into facts, but to challenge them” (p.65). 

Rationalistic budget reforms do “smack with
hybris”; that is clear. And practitioners do seem to
perceive them as counterweights to the practice
they daily encounter, drowned as they often are
in piles of gloomy budget documents. They
know that budgeting in practice seldom looks 
like the ideals that are presented in textbooks, 
and they are comfortable with ideals that point 
out that budgeting should be something diffe- 
rent. Budget reforms in their eyes can be viewed 
as temporary Utopias, i.e. states that will never 
be reached but which, for the time being, can 
give some comfort by pointing out that the or- 
ganization is probably heading in the right direc- 
tion. And we do not expect the Ancient Greeks 
to be right this time: hybris does not inevitably 
lead to nemesis. 

Whether we want to understand dreams 
about budgeting, its actual practices or the dif- 
ference between the two, we do best by relating 
them to the contexts in which they take place. 
The assumption of universality, typical for cer- 
tain traditions in organization theory, proves 
limiting here. It is fruitless to repeat that budget- 
ing practices do not reflect budgetary ideals. 
Both reflect something else — crucial elements 
of the context in which they are embedded. 
Abstract


In this study, we conduct a laboratory experiment to evaluate three alternative mechanisms for pricing 
transfers between a buyer and a seller. The three mechanisms are: (1) a direct negotiation mechanism with 
which either party suggests terms of trade that the other can accept or reject; (2) a traditional mechanism 
where the transfer price and quantity are determined by the intersection of the reported demand and sup- 
ply schedules of the buyer and seller, respectively; and (3) the Ronen-McKinney mechanism which com- 
putes the transfer quantity in the same way as the traditional mechanism, but is designed to give different 
transfer prices to the buyer and seller in order to induce them to truthfully report their respective 
schedules. For all periods of the experiment, the direct negotiation mechanism is the least efficient of the 
three due to the high frequency of bargaining impasses. The efficiency of the traditional mechanism and the 
efficiency of the Ronen-McKinney mechanism are indistinguishable from one another. However, in the last 
four periods of the experiment where the data exhibit less volatility, the efficiencies of all three mechanisms 
are indistinguishable from each other. When truthful reporting is considered, the Ronen-McKinney 
mechanism had the least amount of misreporting followed by the traditional mechanism and then the direct
negotiation mechanism.


A great deal of effort has been focused on
designing mechanisms for the transfer pricing problem
in a vertically integrated firm that decentralizes
through profit centers (e.g. Kaplan, 1982;
Anthony et al., 1984). Beginning in the mid 50s
(Dean, 1955) and continuing into the present
(e.g. Swieringa & Waterhouse, 1982), this
research has focused on the ability of alternative
transfer pricing mechanisms to satisfy two
normative criteria. These criteria require that: (1) a
mechanism should provide incentives for
self-interested managers to transfer a quantity which
maximizes the firm’s overall profit, and (2) the
information that is received through the
operation of a mechanism should allow central
management to evaluate the contribution of each
profit center without sacrificing the autonomy
of the centers. The techniques that have been 
employed in this research include formal eco- 
nomic analysis, mathematical programming 
methods and behavioral approaches (Abdel- 
khalik & Lusk, 1974). Yet, to date, there has been 
no empirical evaluation of the performance of 
any of the proposed transfer pricing
mechanisms.¹

In this study, we report the results of a
laboratory experiment which was designed to evaluate
the performance of three alternative transfer 
pricing mechanisms. These mechanisms are 
applicable for pricing transfers between profit 
centers when each manager’s compensation 
scheme is based upon his center’s profits. In this 
experiment, we restrict our attention to those 
situations where there is no external market for 
the commodity being transferred. The presence 
of such a market would require the simultaneous 
determination of the external market price and 
the transfer price. Since this is our first attempt at 
examining alternative transfer pricing mechan- 
isms, we have foregone this additional complex- 
ity and have concentrated solely on the 
mechanisms’ performance in bilateral bargain- 
ing situations. 

Our laboratory experiment consists of an ex- 
perimenter, whose actions vary with the transfer 
pricing mechanisms, and bargaining pairs, each 
pair consisting of a buyer and a seller. The trans- 
fer pricing mechanisms are then implemented in 
a manner such that the bargaining pairs engage 
in decision-making over a sequence of trading 
periods to determine a transfer price and quan- 
tity to be transferred. Since the experimenter 
never imposes price or quantity decisions, the 
autonomy of both the buyer and the seller is 
maintained. We use the data from the
experiment to evaluate the efficiency and incentive
compatibility of each mechanism. Efficiency
measures the ability of each mechanism to
achieve an outcome which maximizes the
combined profits of the buyer and seller, while
incentive compatibility measures the accuracy
of the information reported by the buyer and
seller. Since the mechanisms we consider all
preserve divisional autonomy, the incentive
compatibility measure indicates the reliability of the
headquarters’ assessment of each individual’s
profitability based upon the reported
information.
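The two evaluation criteria described above can be made concrete with a small numerical sketch. The Python fragment below is illustrative only: it assumes that efficiency is scored as the share of the maximum attainable combined profit that a bargaining pair actually realizes, and that reporting accuracy is scored as the mean absolute gap between a reported schedule and the true one. Both scoring rules, and the function names `efficiency` and `misreporting`, are assumptions made for illustration and are not taken from the experimental design.

```python
def efficiency(realized_joint_profit, max_joint_profit):
    """Share of the maximum attainable combined profit that a pair realized.

    Assumed scoring rule (hypothetical): realized / maximum.
    """
    if max_joint_profit <= 0:
        raise ValueError("max_joint_profit must be positive")
    return realized_joint_profit / max_joint_profit


def misreporting(reported, true):
    """Mean absolute gap between a reported schedule and the true schedule.

    Assumed accuracy measure (hypothetical): zero means truthful reporting.
    """
    if len(reported) != len(true) or not true:
        raise ValueError("schedules must be non-empty and of equal length")
    return sum(abs(r - t) for r, t in zip(reported, true)) / len(true)


if __name__ == "__main__":
    # A pair that realizes 90 out of an attainable 120 is 75% efficient.
    print(efficiency(90, 120))
    # A buyer who shades the first two units of a three-unit schedule.
    print(misreporting([10, 8, 6], [12, 9, 6]))
```

Under this scoring, a bargaining impasse in which no transfer takes place yields an efficiency of zero, and a truthful report yields a misreporting score of zero.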

The first of the three alternative mechanisms 
permits direct negotiations between the buyer 
and the seller in that either party can suggest 
terms of trade that the other party may accept or 
reject. With this mechanism, the headquarters 
serves only as a conduit for the exchange of 
transfer price proposals between the two par- 
ties. If both parties behave non-strategically, 
they should agree to transfer the efficient quan- 
tity because this quantity maximizes their com- 
bined profits. However, critics have argued that 
this mechanism heightens conflict between the 
bargaining parties (Cyert & March, 1963). 
Furthermore, Coursey (1982) has shown that 
impasse may become a bargaining tool with 
direct negotiations. If similar results are ob- 
served in our environment, each bargaining 
pair’s ability to maximize their combined profits 
will be impaired. Finally, Dopuch & Drake 
(1964) have suggested that with this 
mechanism, performance evaluation is based on 
the ability to negotiate rather than on perform- 
ance itself. In turn, this should reduce the accu- 
racy of the information reported by the buyer 
and the seller when they engage in direct negoti- 
ation. 

The second mechanism is based upon the 
traditional view of transfer pricing (e.g. Hirsh- 
leifer, 1956), where each party announces their 
respective supply and demand schedules and 
the headquarters proposes a transfer price and 
quantity given by the intersection of the two re- 
ported schedules. This transfer pricing 
mechanism leads to an efficient transfer if both 
parties truthfully reveal their information.
However, with this mechanism, each party has an
incentive to misrepresent his private information
at the expense of the other party’s profit and
combined profits. These misrepresentations will
impair the incentive compatibility of the
traditional mechanism which, in turn, should reduce
its efficiency.

¹ The only empirical work to date consists of descriptive surveys of transfer pricing mechanisms currently used by firms (e.g. Eccles, 1983).

The third mechanism, which was suggested 
by Ronen & McKinney (1970), has the impor- 
tant theoretical property that it can lead to an 
equilibrium which is incentive compatible. As 
with the traditional mechanism, the buyer and 
seller report their demand and supply schedules
to the headquarters which, in turn, proposes a
transfer price and quantity. The proposed price
and quantity with the Ronen-McKinney
mechanism are, in general, different from those
proposed with the traditional mechanism. In
particular, while the traditional mechanism 
demonstrates what the optimal transfer price 
and quantity should be, it fails to suggest a way of 
implementing a procedure to accomplish this 
objective. By treating the buyer as a perfectly 
discriminating monopsonist, the seller as a per- 
fectly discriminating monopolist and having the 
headquarters subsidize the buyer and seller, the 
Ronen-McKinney mechanism attempts to over- 
come this implementation problem. The diffi- 
culty with this mechanism is that while truth- 
telling leads to one equilibrium it is not the only 
equilibrium. We will discuss this difficulty in 
further depth in the next section. 

The theoretical properties of each of the three 
mechanisms have been developed within the 
context of a single period model. Taken literally,
this requires that the buyer and seller have only
one opportunity to interact, where they each
simultaneously report their private information
as required by the mechanism and, based on their
reports, the mechanism determines a transfer
price and quantity. After the transfer is made, the
two parties compute their respective profits and 
never interact again. 

As described earlier, our experimental 
environment does not correspond exactly to the 
above scenario since the buyer and seller 
interact repeatedly — both within and across 
trading periods. Thus, we employ the “off the 
domain” approach which has been used in much 
of the recent experimental economics literature 
(see Plott, 1982; Smith, 1982). With this
approach, one inquires about the ability of existing
models to predict behavior in an environment
that has some subjectively determined, interesting
properties. Even in situations where
one or more assumptions are clearly violated, it 
is not uncommon for a model to accurately cap- 
ture the observed behavior. 

In the next section, the experimental design is 
outlined, along with a description of the labora- 
tory experiment used to analyze the three alter- 
native solutions to the transfer pricing problem. 
This section also includes the predictions of the 
three transfer pricing mechanisms. In the 
Results section, we present summary measures 
of the experiment and provide a discussion and 
interpretation of them. A summary and some 
suggestions for future research are given in the 
final section.


THE LABORATORY EXPERIMENT 


Each mechanism was tested on nine separate 
bargaining pairs; thus, the experiment consisted 
of twenty-seven bargaining pairs, all of whom 
were student volunteers of the University of 
Iowa. The experiment was conducted in nine 
separate sessions with each session consisting of 
three bargaining pairs. The same mechanism was 
imposed on all three pairs in each session. Par- 
ticipants were randomly designated as either 
buyers or sellers. To maintain anonymity, buyers 
arrived about 10 minutes before sellers and 
were seated in a separate room from sellers. The 
buyers were also excused ahead of the sellers. 
When all participants had arrived, the instruc- 
tions (which are reproduced in the appendix) 
were read separately in each room and questions 
were answered. Participants were then asked to 
complete a set of practice calculations (also
reproduced in the appendix) which was designed
to verify that they understood the trading
procedures, the information available in the
experiment and their profit calculations.

Value for the units being traded was estab- 
lished by application of induced value theory 
(Smith, 1976). Specifically, the commodity
being transferred was given value by the rules
governing the transfer and the marginal cost and
marginal revenue schedules given to the seller
and buyer, respectively. In our experiment, each
buyer was given a marginal revenue schedule
which was his private information. This
schedule contained the amount the experimenter
would pay him for each of the units
purchased from the seller during a trading
period. The difference between each unit’s
revenue and its purchase price is the buyer’s
profit, which was his to keep. Correspondingly,
each seller’s private information was his marginal
cost schedule, i.e. the amount the seller must
pay to the experimenter for each of the units
sold to the buyer during a trading period. A seller’s
profit, which was his to keep, is the difference
between each unit’s selling price and its cost.








Figure 1 presents the parameter values that 
were used by each bargaining pair in the experi- 
ment. From this figure, it can be seen that the 
optimal quantity to be transferred is six or seven 
units since the combined profits of the buyer 
and seller, $1.92, are maximized at this quantity.
Further, the transfer price associated with this
quantity is $2.80.²

Each bargaining pair participated in ten deci- 
sion-making periods. After the first period, each 
successive period was a strict replication of pre- 
vious periods in the sense that the same transfer 
pricing mechanism was employed and each 
buyer’s (seller’s) marginal revenue (cost)
schedule was the same. At the conclusion of
these ten periods, each participant cumulated
his profits from all trading periods and was paid
this amount in cash. Payments ranged from
about $8.00 to $19.00.

Fig. 1. Parameter values (marginal cost and marginal revenue schedules plotted against quantity, with the price of $2.80 marked). Shaded areas show the buyer’s profit, which is $0.96, the seller’s profit, which is $0.96, and the combined profits of buyer and seller, which equal $1.92.

²Note, however, that with a transfer price of $2.80, both the buyer and the seller earn zero profits on trading the 7th unit. Since traders should be indifferent towards exchanging this unit, we are unable to provide a precise prediction about whether or not this unit will be transferred. One approach to overcoming this difficulty would be to adopt the procedure of Plott & Smith (1978) who paid participants a 5¢ commission on each unit that they traded. However, they prevented traders from shifting their schedules by this amount by prohibiting buyers (sellers) from reporting more (less) than their marginal revenue (cost) schedule. We are reluctant to use this technique here since, as will be seen below, some of the more interesting equilibria in the Ronen-McKinney mechanism occur when buyers (sellers) overreport (underreport) their private valuation schedules. Consequently, the optimal quantity under each mechanism is 6 or 7 units.

The treatment variable in the experiment is 
the transfer pricing mechanism. A mechanism, 
however, is simply a rule which specifies a trans- 
fer price and quantity given the information re- 
ported by the buyer and seller. To implement 
such a rule or mechanism, a process must be 
specified which defines how information is 
transferred, how agreement is reached, and how 
payoffs are made. While there are conceivably 
many such processes, we chose one particular 
process in which each transfer pricing 
mechanism was imbedded. The process we 
chose was conducted as a three stage iterative 
procedure where: (1) each participant sent a 
private message to the experimenter; (2) the ex- 
perimenter calculated proposed transfer prices 
and quantities and sent this message to both par- 
ties; and (3) the buyer and seller either accepted 
or rejected the proposal. If both accepted the 
proposal, an agreement was reached and each 
participant computed his profits for the period 
based on this agreement. If either rejected the 
agreement, the participants repeated the above 
procedure. If the 10-minute time limit was ex- 
ceeded before an agreement was reached, each 
subject earned zero profits for that period. 


The direct negotiation mechanism 

With direct negotiations, the buyer’s message 
consisted of a bid indicating how much he was 
willing to pay the seller for 1 unit, 2 units, etc., up 
to and including 10 units. The seller’s message 
consisted of an offer indicating the price 
required to sell 1 unit, 2 units, etc., up to and
including 10 units. The experimenter simply
transferred the buyer’s proposal to the seller and
vice versa. There could be only one proposal
from either the buyer or seller outstanding at
any point in time and the party receiving the
proposal had to either accept or reject it. An
agreement occurred when one party accepted
one of the price and quantity combinations the
other party had proposed.

If both parties behave non-strategically, they
should transfer the efficient quantity since this
quantity maximizes their combined profits.
Further, if they split these profits equally, the
transfer price will be $2.80, the efficient transfer
price for the parameter values in Fig. 1. However,
each party may have an incentive to behave
strategically. This occurs whenever one of the
parties believes that he can manipulate the outcome
to earn larger profits. For instance, a seller
has an incentive to sell less than the efficient
quantity if he believes that he will receive a price
which will increase his profits.
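The equal-split claim can be checked with a few lines of arithmetic. The sketch below is only illustrative: the marginal revenue and marginal cost schedules (in cents) are assumptions, not the values of Fig. 1, and were chosen so that their totals agree with the profit figures quoted in the text. The function names are likewise hypothetical.

```python
# Hypothetical schedules in cents per unit (NOT taken from Fig. 1; chosen so
# that their totals match the profit figures quoted in the text).
MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]  # buyer's marginal revenue
MC = [254, 258, 262, 266, 270, 274, 280, 286, 292, 300]  # seller's marginal cost


def combined_profit(quantity):
    """Joint profit of buyer and seller, in cents, when `quantity` units trade."""
    return sum(MR[:quantity]) - sum(MC[:quantity])


def equal_split_price(quantity):
    """Per-unit price, in cents, at which both parties earn the same profit."""
    total_revenue = sum(MR[:quantity])
    total_cost = sum(MC[:quantity])
    # buyer profit  = total_revenue - price * quantity
    # seller profit = price * quantity - total_cost
    # Equating the two gives the expression below.
    return (total_revenue + total_cost) / (2 * quantity)


for quantity in (6, 7):
    print(quantity, combined_profit(quantity), equal_split_price(quantity))
```

Under these assumed schedules, both 6 and 7 units give combined profits of 192 cents and an equal-split price of 280 cents, which agrees with the $1.92 and $2.80 stated above.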


The traditional mechanism 

With the traditional mechanism, both the 
buyer and seller provided schedules to the ex- 
perimenter. The schedule reported by the buyer 
(seller) indicated how much he was willing to 
pay for (required to sell) 1 unit, 2 units, etc., on 
up to and including 10 units. The experimenter 
then proposed a transfer price and quantity. This 
proposed price and quantity was determined by 
the intersection of the two reported schedules. 
Either the buyer or seller could veto the proposal.³
If both accepted the proposal, a binding
agreement was made based upon the proposal.

³Since individuals could only trade an integer-valued number of units, there is sometimes a range of prices at which the quantity demanded and supplied are equal with the traditional mechanism. That is, a bid submitted by the buyer for the last unit transacted may exceed the offer submitted by the seller to sell that unit. When this occurred, the average of these prices was proposed as the transfer price. Since both the buyer and seller could veto such a proposal, this rule should have no effect on the outcomes.

If both parties truthfully report their respective schedules, an
efficient transfer should occur. However, each party has an incentive
to misreport in this environment. To see this, re-examine the
parameter values in Fig. 1. If the buyer truthfully reports his
marginal revenue schedule then, relative to reporting truthfully,
the seller can increase his profits by reporting the marginal cost of
his first four units truthfully and reporting that his marginal cost
for the fifth and all additional units is $2.90 each. The traditional
mechanism would then transfer five units and set a transfer price of
$2.90. Such a reporting strategy would increase the seller’s profits
from $0.96 to $1.40. However, the buyer’s profits would decrease from
$0.96 to $0.40 and combined profits would also be reduced from $1.92
to $1.80. Similarly, it can be shown that when the seller reports
truthfully, the buyer has an incentive to understate his marginal
revenue information.
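The seller's misreporting incentive can be traced through a small sketch of the traditional mechanism. The schedules (in cents) are assumptions made for illustration, not the values of Fig. 1; they were picked so that the results agree with the dollar amounts quoted above. The function names are hypothetical, and the pricing rule follows the averaging rule described for the last unit transacted.

```python
# Hypothetical TRUE schedules in cents per unit (assumed, not from Fig. 1).
MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]  # buyer's marginal revenue
MC = [254, 258, 262, 266, 270, 274, 280, 286, 292, 300]  # seller's marginal cost


def traditional(reported_mr, reported_mc):
    """Return (quantity, price) at the intersection of the reported schedules."""
    quantity = 0
    for bid, offer in zip(reported_mr, reported_mc):
        if bid < offer:  # the next unit is no longer worth trading
            break
        quantity += 1
    if quantity == 0:
        return 0, None
    # If bid and offer on the last unit differ, propose their average.
    price = (reported_mr[quantity - 1] + reported_mc[quantity - 1]) / 2
    return quantity, price


def profits(quantity, price):
    """Return (buyer_profit, seller_profit) in cents, using the TRUE schedules."""
    buyer = sum(MR[:quantity]) - price * quantity
    seller = price * quantity - sum(MC[:quantity])
    return buyer, seller


truthful = traditional(MR, MC)
# Seller reports the first four units truthfully, then $2.90 for every later unit.
strategic = traditional(MR, MC[:4] + [290] * 6)

print(truthful, profits(*truthful))
print(strategic, profits(*strategic))
```

With these assumed numbers, truthful reporting yields 96 cents each, while the seller's misreport moves the outcome to five units at 290 cents, raising his profit to 140 cents and lowering the buyer's to 40 cents.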


The Ronen-McKinney mechanism

Trading procedures in this mechanism are 
exactly the same as the procedures for the tradi- 
tional mechanism except for the determination 
of the transfer price. As before, the proposed 
quantity to be traded is the amount which 
equates the reported marginal cost to the 
reported marginal revenue. However, the pro- 
posed transfer price which the buyer must pay is 
the seller’s reported average cost for providing 
that quantity. The proposed transfer price the 
seller receives is the buyer’s reported average 
revenue from receiving that quantity.⁴


To illustrate how this mechanism operates, let the lines MC* and AC*
in Fig. 2 represent the seller’s true marginal cost and average cost,
respectively, and let the lines MR* and AR* represent the buyer’s true
marginal revenue and average revenue, respectively. The efficient
quantity to be transferred is Q* since combined profits, AOB, are
maximized at this quantity. If the buyer and seller report their true
marginal cost and revenue information, the seller’s transfer price is
P_s and the buyer’s transfer price is P_b and the quantity Q* is
transferred. The profit to the seller is OP_sCB, the profit to the
buyer is AP_bDB and the subsidy is P_sP_bDC. Furthermore, since
OP_bE = EBD and AP_sF = FBC (because, by construction, the total
amount paid by the buyer equals the seller’s reported total cost,
OP_bDQ* = OBQ*, and the total amount received by the seller equals the
buyer’s reported total revenue, OP_sCQ* = OABQ*), combined profits are
OP_sCB + AP_bDB - P_sP_bDC = AOB. For the parameter values used in the
experiment, the transfer price for the seller and buyer is $2.96 and
$2.64, respectively, when the quantity is 6 units ($2.94 and $2.66,
respectively, when the quantity is 7 units). This implies a profit of
$1.92 for the buyer and $1.92 for the seller; consequently, the
subsidy is also $1.92.
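The prices, profits and subsidy quoted for truthful reporting can be reproduced with a short sketch. As before, the schedules (in cents) are assumptions chosen to be consistent with the figures in the text rather than the actual values of Fig. 1, and `rm_outcome` is a hypothetical helper name.

```python
# Hypothetical TRUE schedules in cents per unit (assumed, not from Fig. 1).
MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]  # buyer's marginal revenue
MC = [254, 258, 262, 266, 270, 274, 280, 286, 292, 300]  # seller's marginal cost


def rm_outcome(reported_mr, reported_mc, quantity):
    """Ronen-McKinney prices and profits (cents) for a given transfer quantity."""
    seller_receipts = sum(reported_mr[:quantity])  # buyer's reported total revenue
    buyer_payment = sum(reported_mc[:quantity])    # seller's reported total cost
    return {
        "seller_price": seller_receipts / quantity,  # buyer's reported average revenue
        "buyer_price": buyer_payment / quantity,     # seller's reported average cost
        "seller_profit": seller_receipts - sum(MC[:quantity]),
        "buyer_profit": sum(MR[:quantity]) - buyer_payment,
        "subsidy": seller_receipts - buyer_payment,  # paid by the headquarters
    }


six = rm_outcome(MR, MC, 6)    # truthful reports, 6 units transferred
seven = rm_outcome(MR, MC, 7)  # truthful reports, 7 units transferred

for outcome in (six, seven):
    net = outcome["seller_profit"] + outcome["buyer_profit"] - outcome["subsidy"]
    print(outcome, "combined profits net of subsidy:", net)
```

At 6 units the assumed schedules give prices of 296 and 264 cents, and at 7 units the prices round to 294 and 266 cents. In both cases each party earns 192 cents, the subsidy is 192 cents, and combined profits net of the subsidy are 192 cents.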





Fig. 2. Ronen-McKinney transfer pricing mechanism (horizontal axis: quantity). OP_sCB, seller’s profit; AP_bDB, buyer’s profit; P_sP_bDC, subsidy; AOB, combined profits, where OP_sCB + AP_bDB - P_sP_bDC = AOB.


⁴Moreover, the Ronen-McKinney mechanism maintains the autonomy of the buyer and seller because either the buyer, the seller or both can select the quantity to be transferred without having that decision monitored by the headquarters. In our experiment, the experimenter proposed both a quantity to be transferred and a separate transfer price for the buyer and seller, respectively. Since either the buyer or seller could veto any proposal, this procedure also maintained their autonomy.

With this mechanism, truthful reporting is a Nash equilibrium
strategy for both the buyer and seller. To illustrate this, suppose
the seller reports his true marginal cost schedule to the
experimenter. This reported information then becomes the buyer’s
marginal cost schedule. Since the seller’s reported marginal cost
equals the buyer’s marginal revenue at Q*, the buyer’s most preferred
transfer quantity is Q*. Since the schedule reported by the buyer only
affects the quantity he will receive from the seller, the buyer will
report information such that the headquarters will propose Q*. Hence,
the buyer can do no better than report his true marginal revenue
information. Correspondingly, the seller’s true marginal cost
information is his best report given the buyer sends his true marginal
revenue information. With this reported information, the headquarters
will propose Q*, P_s and P_b and both parties should accept this
proposal. Thus, an incentive-compatible Nash equilibrium exists with
the Ronen-McKinney mechanism.

Unfortunately, while truthful reporting is a Nash equilibrium, it is
not the only equilibrium that exists with the Ronen-McKinney mechanism
(Groves & Loeb, 1974). Alternative Nash equilibria result whenever
each party’s reported schedule intersects the other party’s true
schedule at the same quantity. Of particular interest is the
equilibrium where the combined profits of the buyer and seller are the
largest. In this case, the equilibrium strategies are for the seller
to understate his marginal cost as $0.00 for all 10 units and for the
buyer to overstate his marginal revenue as $3.00 for all 10 units.⁵
Based on this information, the headquarters will propose that ten
units be transferred where the proposed transfer price for the seller
is $3.00 and the proposed transfer price for the buyer is $0.00.
Profits of the buyer and seller are $28.58 and $2.58, respectively,
which are greater than their respective profits at the
incentive-compatible Nash equilibrium. Thus, this equilibrium leaves
both parties better off than at the truthful equilibrium and it seems
plausible that both parties would discover this method for exploiting
the mechanism to their mutual advantage.
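The payoffs at the alternative equilibrium can be compared with the truthful outcome in a brief sketch. The true schedules (in cents) are assumptions chosen to agree with the totals quoted in the text, not the values of Fig. 1, and `rm_profits` is a hypothetical helper. The last two calls only illustrate that, under these assumed numbers, reverting to a truthful report does not raise the deviating party's own profit; they are not a proof of the equilibrium property.

```python
# Hypothetical TRUE schedules in cents per unit (assumed, not from Fig. 1).
MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]  # buyer's marginal revenue
MC = [254, 258, 262, 266, 270, 274, 280, 286, 292, 300]  # seller's marginal cost


def rm_profits(reported_mr, reported_mc):
    """Return (quantity, buyer_profit, seller_profit) in cents under the
    Ronen-McKinney rule, with profits evaluated on the TRUE schedules."""
    quantity = 0
    for bid, offer in zip(reported_mr, reported_mc):
        if bid < offer:  # reported schedules have crossed
            break
        quantity += 1
    buyer_pays = sum(reported_mc[:quantity])       # seller's reported total cost
    seller_receives = sum(reported_mr[:quantity])  # buyer's reported total revenue
    buyer_profit = sum(MR[:quantity]) - buyer_pays
    seller_profit = seller_receives - sum(MC[:quantity])
    return quantity, buyer_profit, seller_profit


truthful = rm_profits(MR, MC)                    # both report truthfully
alternative = rm_profits([300] * 10, [0] * 10)   # $3.00 and $0.00 for every unit
buyer_reverts = rm_profits(MR, [0] * 10)         # only the buyer returns to truth
seller_reverts = rm_profits([300] * 10, MC)      # only the seller returns to truth

print(truthful, alternative, buyer_reverts, seller_reverts)
```

With these assumed schedules the alternative reports transfer ten units and give the buyer 2858 cents and the seller 258 cents, compared with 192 cents each under truthful reporting.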


⁵For the laboratory environment we examined, these particular values arise since we restricted the buyer from reporting a marginal revenue of more than $3.00 and the seller from reporting a negative marginal cost. This was done in order to limit the maximum possible payout that participants would earn in the event they found this alternative Nash equilibrium.


RESULTS


Figures 3, 4 and 5 present the data from the experiment for the direct
negotiation, traditional and Ronen-McKinney mechanisms, respectively.
Figures 3 and 4 are organized into two panels with the first panel
displaying the time series of transfer quantities and the second panel
displaying the time series of transfer prices for each of the nine
bargaining pairs. A dot in the NA row of the quantity and price panels
indicates a failure of the bargaining pair to reach an agreement in
that period. Figure 5 is organized the same way as Figs 3 and 4, with
the exception that the second panel has two transfer prices in each
period: the one the seller receives, s, and the one the buyer pays, b.

Our analysis of the data proceeds in two parts: first, we examine the
profitability of the decisions reached under each mechanism; second,
we evaluate the ability of each mechanism to provide incentives for
eliciting truthful reporting. In dealing with the data, we are
confronted with some open problems which arise in experimental work
where the cost of conducting experiments places a significant
constraint on the number of observations. As is obvious from the
figures, there is a high degree of serial correlation in the decisions
made by each bargaining pair. While this is suggestive of a learning
process, a convergence process or both, we are without a theory about
such processes and thus are unable to account for their effect.
Consequently, we must provide the usual caveat that the statistical
tests reported below should be regarded more as descriptive measures
than as classical hypothesis tests.

It is also known that experimental markets tend to converge rather
than attain equilibrium immediately. Further, many static economic
models of market behaviour are known to be accurate only after a
series of repeated trials in a stationary environment (Plott, 1982;
Smith, 1982). However, no convention has been established for the
number of replications that are required for the data to approach
equilibrium levels. Due to this, we conduct each of our analyses using
all the data from the experiment and also using only the data from the
last four periods. For the measures examined, the data from the last
four periods are of particular interest because these data exhibit
less volatility and, relative to the early periods, seem to have
“settled down”. Finally, after we introduce each of these measures,
the evidence on the decreased volatility of each measure is given.

Fig. 3. Direct negotiation transfer pricing mechanism (panels: quantity transferred by period; price by period). NA, no agreement.


Are joint profit-maximizing decisions achieved?

To determine whether the decisions made by
the bargaining pairs maximized their joint profit,
we constructed an efficiency measure for each
possible transaction. Efficiency is defined as the
ratio of combined profits realized by the buyer
and seller to their maximum combined profits
attainable in equilibrium. We examine these
efficiencies and contrast them by mechanism to
identify significant differences between the
mechanisms’ performance. We conclude by
examining the transfer quantities to identify the
source of any inefficiencies.

The maximum joint profit attainable in
equilibrium is $1.92 (see Fig. 1). The realized
joint profit of the buyer and seller could exceed
this maximum with the Ronen-McKinney
mechanism due to the subsidy paid to the buyer
and seller. To compensate for this potential
difficulty, the subsidy paid to the buyer and seller
was not included in this calculation. Net of the
subsidy, realized joint profit for the Ronen-McKinney
mechanism cannot exceed $1.92.⁷
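The efficiency measure can be sketched as follows. Because the transfer price (and, net of the subsidy, the Ronen-McKinney prices) cancels out of combined profits, efficiency depends only on the quantity transferred. The schedules (in cents) are assumptions consistent with the $1.92 maximum, not the values of Fig. 1. The function name and the use of `None` for a period without agreement are illustrative choices.

```python
# Hypothetical TRUE schedules in cents per unit (assumed, not from Fig. 1).
MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]  # buyer's marginal revenue
MC = [254, 258, 262, 266, 270, 274, 280, 286, 292, 300]  # seller's marginal cost

MAX_JOINT_PROFIT = 192  # cents; the $1.92 maximum stated in the text


def efficiency(quantity):
    """Realized combined profit (net of any subsidy) over the maximum.

    `quantity` is the number of units transferred, or None when the pair
    failed to reach an agreement, in which case efficiency is zero.
    """
    if quantity is None:
        return 0.0
    joint_profit = sum(MR[:quantity]) - sum(MC[:quantity])
    return joint_profit / MAX_JOINT_PROFIT


# Illustrative sequence of outcomes for one pair: 6 units, 5 units,
# no agreement, 7 units.
outcomes = [6, 5, None, 7]
average = sum(efficiency(q) for q in outcomes) / len(outcomes)
print([efficiency(q) for q in outcomes], average)
```

Under these assumptions a transfer of 5 units has an efficiency of 0.9375, while a single period without agreement pulls the four-period average down to about 0.73.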

⁶When analyzing the last four periods, it should be emphasized that each bargaining pair had collected more information in the first six periods than was available from just six buy and sell offers. On average, each pair initiated five buy and sell offers per period. Thus, each pair had considered on average 30 buy and sell offers by the end of period 6.

⁷An alternative way to view this calculation is that it is the maximum joint profit of the buyer, seller and the headquarters. This is also $1.92 because the headquarters’ losses equal the subsidy paid to the buyer and seller.

The average efficiencies for the three mechanisms are presented in the
upper left-hand panel of Table 1. These means were calculated for all
periods and the last four periods. For all periods, the average
efficiencies for the direct negotiation, traditional and
Ronen-McKinney mechanisms, respectively, are significantly less than
1.0 using a one-tailed t-test. When the means were compared across
mechanisms, the Ronen-McKinney mechanism had the highest efficiency.
The traditional mechanism was second and the direct negotiation
mechanism was third. To determine the significance of these
differences across the mechanisms, we conducted an analysis of
variance (ANOVA) with a repeated measure design.⁹ The results,
presented in the lower left-hand panel of Table 1, indicate that at
least one of the mechanisms is distinct from the other two, since the
main effect, mechanism, is statistically significant. Further, when
each pair of mechanisms is individually contrasted, the efficiency of
the direct negotiation mechanism differs significantly from the
efficiency of both the traditional and Ronen-McKinney mechanisms.
There is no significant difference between the latter two mechanisms.
There are two possible sources for the significant difference in
efficiencies between the direct negotiation mechanism and the other
two mechanisms. First, with the direct negotiation mechanism, there
could be more periods in which no agreements were reached. Efficiency
is zero in such periods. Second, the agreements that are reached could
be less efficient with the direct negotiation mechanism. To evaluate
these two possible explanations, we analyzed the number of agreements
and the efficiencies of actual agreements.

⁸When reporting the results of statistical tests, we will adopt the convention of referring to a difference as being statistically significant if the test has a p value of 0.05 or less. If p is greater than 0.05 but less than or equal to 0.10, the difference will be called marginally significant. All other differences will be said to be not significant. Since p values for all of the test results are also reported, the reader should have no difficulty in interpreting our results.

⁹While ANOVA with a repeated measure design allows for autocorrelation, this correlation is assumed to be constant (see Winer, 1971, chapter 7). Since we are unable to claim that this technique will be completely correct for the autocorrelation in the data we examine, we must continue to provide a caveat about interpreting any of our tests as classical hypothesis tests.

Fig. 4. Traditional transfer pricing mechanism (panels by period). NA, no agreement.

Fig. 5. Ronen-McKinney transfer pricing mechanism. s, price received by seller; b, price paid by buyer; a combined symbol marks periods in which the price received by seller equals the price paid by buyer; NA, no agreement.

The number of agreements reached with each
mechanism and the average efficiencies of these
agreements are given in Table 1 in the upper
center and upper right-hand panels, respectively.
Using a one-tailed t-test, the average
efficiencies for all agreements are significantly
less than 1.0 for all three mechanisms. However,
when considering these averages there is no
apparent difference across the three
mechanisms. On the other hand, there is a striking
difference in the number of agreements
reached between the direct negotiation
mechanism and the other two mechanisms. This,
in turn, suggests why the direct negotiation
mechanism attains significantly less efficient
allocations than the other mechanisms: there are
fewer agreements with the direct negotiation
mechanism.

We also tested for differences in these measures
across mechanisms using an ANOVA. The
results of the ANOVA which analyzes the number
of agreements are presented in the lower
center panel of Table 1. The main effect,
mechanism, is statistically significant and the
contrasts indicate that this significance is due to
the direct negotiation mechanism being different
from both the traditional and Ronen-McKinney
mechanisms. The traditional and Ronen-McKinney
mechanisms do not differ from each
other. In the lower right-hand panel of Table 1,
we present the results from the ANOVA which
analyzes the efficiencies for actual agreements.
We found no significant differences between
mechanisms.

We analyzed the average efficiencies for the
last four periods of the experiment and found
that all three mechanisms continue to perform at
less than 100% efficiency (see last four periods
under the efficiency column in Table 1). While
the Ronen-McKinney mechanism continues to
give the highest efficiency and the direct negotiation
mechanism continues to yield the lowest,
the ANOVA results for the last four periods indicate
that none of the mechanisms are statistically
distinct from the other mechanisms. Similarly,
the main effect, Mechanism, is not statistically
significant in either the ANOVA which analyzes
the efficiencies of the agreements or the ANOVA
which examines the number of agreements in
the last four periods.


For all of the measures presented in Table 1, the data from the last
four periods seem to have settled down. To see this, examine the
significance of the time period in each ANOVA. In all cases, the main
effect, Time, is statistically significant when all data are used. On
the other hand, when the analysis is conducted on the last four
periods of data, neither the main effect, Time, nor the interaction
effect, Mechanism*Time, is statistically significant.¹⁰ The time
series of the average transfer quantities and prices for agreements
presented in Fig. 6 reinforces the conclusion that there is
considerably less volatility in the last four periods. Since
efficiency depends directly upon the transfer quantity, this figure
also graphically demonstrates the increased stability of this measure
towards the conclusion of the experiment.

¹⁰Since there is a relatively small number of observations in the last four periods, this lack of significance could be due to the low power of the test. To evaluate the ability of this test to detect significance, we also conducted ANOVAs for all performance measures using the data from the first four periods. Either the main effect (Time), the interaction effect (Mechanism*Time) or both are statistically significant in all of these ANOVAs. Thus, relative to the first four periods, these results demonstrate that the data in the last four periods are less volatile.

Fig. 6. Mean quantities and prices for agreements. b, buyer’s transfer price; s, seller’s transfer price.

Finally, the time series of transfer quantities in
Figs 3, 4, and 5 demonstrate that the inefficiencies
are a result of a less than optimal quantity
being transferred.¹¹ These inefficiencies
occurred whenever sellers overreported their
marginal costs or buyers underreported their
marginal revenues or both. For example, when
sellers overreport their cost, the marginal cost
schedule shifts up and to the left of the true marginal
cost schedule in Fig. 1. Correspondingly,
when buyers underreport their revenues, the
marginal revenue schedule shifts down and to
the left of the true marginal revenue schedule in
Fig. 1. Thus, the “new” marginal cost and
revenue schedules intersect to the left of the optimal
quantity. The quantity transferred is then
less than the optimal quantity of 6 or 7 units. We
next examine the results of this misreporting by
mechanism.


Is the information reported accurate? 


We address this issue by examining each buyer’s and each seller’s
reported schedule for all mechanisms for all agreements and agreements
in the last four periods.¹² We computed the average absolute deviation
of their reported marginal revenue and marginal cost from their true
marginal revenue and marginal cost for the first 7 units on their
schedules. The tests were conducted using these first 7 units. We have
chosen to do this for two reasons. First, since each buyer’s and each
seller’s schedule did not change from one period to the next, it
appears that they soon learned that the last 3 units were
extramarginal and paid little attention to the amounts they reported
for these units. (Recall that more than seven units were transferred
only once.) Second, if the headquarters were to assess the
profitability of the buyer and seller, it would only require the
information reported on inframarginal and marginal units. However, we
also conducted all of our tests using all 10 units and we will use
footnotes to report when the results using all 10 units differ
qualitatively from those using the first 7 units.¹³
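A minimal sketch of the deviation measure follows, assuming that deviations are averaged entry by entry over the first 7 units and that any entry misreported by more than $1.00 is dropped, as the censoring footnote describes. The true and reported schedules (in cents) are invented for illustration, and the function name and parameters are hypothetical.

```python
# Hypothetical true marginal revenue schedule in cents (assumed, not from Fig. 1).
TRUE_MR = [306, 302, 298, 294, 290, 286, 280, 274, 268, 260]


def mean_abs_deviation(reported, true, units=7, censor=100):
    """Average absolute deviation (cents) over the first `units` entries,
    dropping any entry misreported by more than `censor` cents."""
    deviations = [abs(r - t) for r, t in zip(reported[:units], true[:units])]
    kept = [d for d in deviations if d <= censor]
    if not kept:
        return None  # every entry was censored
    return sum(kept) / len(kept)


# Illustrative reports: a mild understatement, and the same report with one
# gross misstatement on the first unit (which the censoring rule removes).
mild = [290, 290, 290, 290, 285, 280, 280, 0, 0, 0]
gross = [100] + mild[1:]

print(mean_abs_deviation(mild, TRUE_MR))
print(mean_abs_deviation(gross, TRUE_MR))
```

Entries for the last 3 units are ignored by construction, so the zeros reported there do not affect the measure.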

Table 2 presents our analysis of truthful 
reporting. The means of the absolute deviations 
from true marginal cost and marginal revenue 
for the three mechanisms are all significantly 
greater than zero for all agreements and for 
agreements in the last four periods. Comparing 
across mechanisms, the Ronen-McKinney 
mechanism had the smallest average absolute 
deviation while the direct negotiation 
mechanism had the largest average deviation. 
This ranking stayed the same regardless of 
whether all agreements or agreements in the last 
four periods were used. We again conducted an 
ANOVA with a repeated measure design to
determine the significance of these differences, 
using all agreements and using agreements in the 
last four periods. The results are presented in the 
lower panel of Table 2. Since the main effect, 
Mechanism, is statistically significant, we then 
contrasted the mechanisms against one another. 
The results of this analysis, which are also re- 
ported in Table 2, suggest that in all pairwise 





¹¹ There is only one instance when a quantity greater than the efficient level was transferred. This occurred in the first period
of the third bargaining pair with the direct negotiation mechanism.

¹² For the traditional and Ronen-McKinney mechanisms, these were the schedules that led to the proposed contract which
both parties accepted. For the direct negotiation mechanism, however, the schedule that led to the agreement was used along
with the one that immediately preceded it (i.e. the one previously reported by the other party).

¹³ Occasionally, market participants grossly misstated their true information. So that these few observations would not
significantly skew the measures reported here, we censored our data by eliminating any entry where buyers misreported their
marginal revenue by more than $1.00 or where sellers misreported their marginal cost by more than $1.00. For all agreements,
this procedure deleted 47 observations with the direct negotiation mechanism, one observation with the traditional
mechanism, and none of the observations with the Ronen-McKinney mechanism. When using only the data for the agreements
in the last four periods, this procedure deleted 20 observations from the direct negotiation mechanism and no observations
from either the traditional or Ronen-McKinney mechanism. It should be kept in mind that this evidence is consistent
with the direct negotiation mechanism leading to the most misreporting.
TABLE 2. Truthful reporting

Mean (p-value)* absolute deviation from true information

                         All               Agreements in
Mechanism                agreements        last four periods
Direct negotiation       $0.203 (0.001)    $0.176 (0.001)
Traditional               0.142 (0.001)     0.124 (0.001)
Ronen-McKinney            0.099 (0.001)     0.090 (0.001)

ANOVA results

                                  All agreements           Agreements in last four periods
Source                            df     F       p         df     F       p
Between subject pairs
  Mechanism                        2    18.76   0.001       2    16.58   0.001
  Error (between)†                24                       24
Within subject pairs
  Time                             2    16.23   0.001       1     0.09   0.76
  Time * Mechanism                 4     4.19   0.005       2     0.04   0.96
  Error (Time * between)§         48                       24
  Buyer-Seller                     1     0.68   0.42        1     0.58   0.45
  Buyer-Seller * Mechanism         2     0.27   0.77        2     0.56   0.58
  Error (Buyer-Seller
    * between)§                   24                       24
  Unit                             1    42.14   0.001       1    55.48   0.001
  Unit * Mechanism                 2     5.53   0.01        2     7.67   0.003
  Error (Unit * between)§         24                       24
Contrast||
  Direct negotiation
    vs traditional                 1    16.42   0.001       1    14.67   0.001
  Direct negotiation
    vs Ronen-McKinney              1    36.17   0.001       1    31.91   0.001
  Traditional
    vs Ronen-McKinney              1     3.91   0.06        1     3.36   0.08

* The absolute deviation is calculated by taking the absolute value of the difference between
reported marginal revenue or cost and true marginal revenue or cost. The first 7 units from the
schedules are used in the absolute deviation calculations. For all agreements (agreements in last
four periods) there were 68 (48) observations with the direct negotiation mechanism where
subjects did not report a revenue or cost number. There were no (no) such observations with
the traditional mechanism and 1 (no such) observation with the Ronen-McKinney mechanism.
The p-value indicates the level of significance for a one-tailed t-test that the absolute deviation is
greater than zero.

† For the F value to follow an F distribution, the sources of variation must be homogeneous. A
check on this assumption using the Fmax test indicates this assumption is satisfied (see Winer,
1971, pp. 520-521).

‡ Because of the design, there are empty cells with the Time and Unit variables when time is 1-10
and units are 1-7. Thus, Time is defined in terms of 3 segments (periods 1-4, 5-7, 8-10) for all
agreements and 2 segments (periods 7-8, 9-10) for agreements in the last four periods. Unit is
defined in terms of 2 segments (units 1-4, 5-7) for both all agreements and agreements in the
last four periods.

§ See Winer (1971, p. 540) for a description of this design and the construction of the error
terms.

|| The between subject pairs error term is used in this analysis.


comparisons, each mechanism is statistically distinct
from the other, although the distinction
between traditional and Ronen-McKinney
mechanisms is marginal. This holds for all agreements
and for agreements in the last four
periods.¹⁴ Further, the truthful reporting measures
seem to have “settled down” by the end of
the experiment. The main effect, Time, and the
interaction effect, Mechanism * Time, are both
statistically significant when all data are used,
while neither effect remains significant when
the last four periods of data are considered.¹⁵
The ANOVAs also show that there is no distinction
between buyer reporting and seller
reporting, nor do the reporting patterns of
buyers and sellers differ by mechanism. In particular,
the buyer-seller main effect and interaction
(Buyer-Seller * Mechanism) effect are not
statistically significant for all agreements or for
agreements in the last four periods. However,
the truthful reporting measure varied across
units and the pattern of the variation across units
also differed by mechanism, i.e. the unit main effect
and interaction effect (Unit * Mechanism)
are statistically significant for all agreements and
agreements in the last four periods. Figure 7 presents
the reporting patterns for the three
mechanisms for all 10 units. As illustrated in the
figure, the means of absolute deviation differ by
unit for each of the mechanisms and the pattern
of the means differs across mechanisms.¹⁶ It can
also be seen from the figure that the Ronen-McKinney
mechanism has the smallest absolute
deviation from true information for the first 7 units.


[Figure 7: mean absolute deviation (vertical axis) plotted against unit (horizontal axis) for the three mechanisms.]

Fig. 7. Truthful reporting by unit. The mean absolute deviation
is calculated by taking the mean of the absolute values of
the differences between reported and true marginal revenue
and marginal cost by unit for each of the three mechanisms.
1, direct negotiation; 2, traditional; 3, Ronen-McKinney; @,
traditional and Ronen-McKinney have the same mean values.


While there is little difference between buyer 
and seller reporting strategies in terms of abso- 
lute deviation, this measure does not provide 
any information about the systematic tendencies 


¹⁴ When reporting is evaluated using all 10 units, the results are qualitatively the same except the difference between the
traditional and Ronen-McKinney mechanisms is no longer statistically significant both for all agreements and for agreements in
the last four periods.

¹⁵ This is reinforced by the result that both effects are statistically significant when the first four periods of data are used in
the ANOVA.

¹⁶ An ANOVA with the same design as that described in Table 2 was run using only the units which were most frequently
transferred (i.e. units 4, 5, 6 and 7). The results of the ANOVA were qualitatively the same as the results of the ANOVA presented
in Table 2.
of buyers or sellers to underreport vs overreport
their true marginal revenue or marginal cost. To
assess overreporting and underreporting, we
used the reported schedules to compute each
buyer's (seller's) average ratio of reported marginal
revenue (marginal cost) to true marginal
revenue (marginal cost) for the first 7 units on
the schedule. A ratio greater than one indicates
overstatement on average and a ratio less than
one indicates understatement on average. A
ratio of one indicates that players reported truthfully
on average.
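The ratio measure can be illustrated as follows. This is a sketch under stated assumptions: the text does not say whether the "average ratio" is the mean of per-unit ratios or a ratio of totals, so the mean of per-unit ratios is assumed here, and the function names and numbers are hypothetical.

```python
from typing import Sequence


def mean_reporting_ratio(
    reported: Sequence[float], true: Sequence[float], n_units: int = 7
) -> float:
    """Mean of reported/true over the first n_units (assumed definition)."""
    ratios = [rep / tru for rep, tru in zip(reported[:n_units], true[:n_units])]
    if not ratios:
        raise ValueError("schedules are empty")
    return sum(ratios) / len(ratios)


def classify(ratio: float, tol: float = 1e-9) -> str:
    """Interpret the ratio as the text does: above, below, or equal to one."""
    if ratio > 1 + tol:
        return "overstatement"
    if ratio < 1 - tol:
        return "understatement"
    return "truthful"


# Hypothetical buyer who shades every reported marginal revenue down by 10%.
true_mr = [100.0, 90.0, 80.0]
reported_mr = [90.0, 81.0, 72.0]

ratio = mean_reporting_ratio(reported_mr, true_mr)
print(classify(ratio))
```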

The means of the ratios for all buyers are 0.924,
0.960 and 0.959 for all agreements for the direct
negotiation, traditional and Ronen-McKinney
mechanisms, respectively. For agreements in the
last four periods, the means of the ratios are
0.942, 0.966 and 0.960, respectively. For all
sellers, the means of the ratios for all agreements
are 1.065, 1.055 and 1.023 for the direct negotiation,
traditional and Ronen-McKinney
mechanisms, respectively. For agreements in the
last four periods for all sellers, the means of the
ratios are 1.062, 1.050 and 1.014, respectively.¹⁷
For each mechanism, the buyer ratio is less than
one and the seller ratio is greater than one and
these differences from one are statistically significant
using a two-tailed t-test (p-values are
0.01 or less). Thus, buyers tend to underreport
and sellers tend to overreport. These results are
consistent with the observation that, on average,
the transferred quantity was less than the optimal
quantity. They also demonstrate that there is
no apparent tendency toward any of the alternative
equilibria with the Ronen-McKinney
mechanism where buyers overreport and sellers
underreport, even though both parties' profits
can increase if they adopt these strategies.


CONCLUDING REMARKS 
The surprising result of this laboratory experiment
is how similar the transfer pricing
mechanisms are with respect to performance, in
particular the traditional and Ronen-McKinney
mechanisms. When all periods of the experiment
are considered, the direct negotiation
mechanism is significantly less efficient than the
other two mechanisms (due to a lower number
of agreements). However, the efficiencies of the
traditional and Ronen-McKinney mechanisms
are indistinguishable. In the last four periods of
the experiment, the efficiencies of all three
mechanisms are indistinguishable.

While there is less misreporting with the
Ronen-McKinney mechanism, the amount of
misreporting in the traditional mechanism is not
very different from the Ronen-McKinney
mechanism. For example, if the means of the
absolute deviation from true information for agreements
in the last four periods are used, the
amount of misreporting under the traditional
mechanism is only 3.4¢ greater than the amount
of misreporting under the Ronen-McKinney
mechanism. When all 10 units are considered,
the traditional and Ronen-McKinney mechanisms
are statistically indistinguishable. On the other
hand, the amount of misreporting for the first 7
units under the direct negotiation mechanism is
8.6¢ greater than the amount of misreporting
under the Ronen-McKinney mechanism even
after we censored 20 observations where the
amount of misreporting exceeded $1.00. Similar
conclusions are reached when comparing the
direct negotiation and Ronen-McKinney
mechanisms using all 10 units.

Thus, the above results suggest that the incentive
compatibility issue may not be a critical one,
at least in the environments considered here.
However, this does not imply that the literature
on incentive compatibility is the result of misplaced
emphasis. Indeed, none of the
mechanisms examined here possesses a unique
equilibrium which is incentive compatible; we
also did not measure the performance of these
mechanisms directly on their domain. Instead,
we view our results as an initial attempt to collect
and evaluate empirical evidence about the
performance of alternative transfer pricing

¹⁷ The relative rankings among the mechanisms by buyers and by sellers stay the same when all 10 units are used in the
calculations.
mechanisms in an interesting environment
where the models should have some predictive
ability. The fact that a mechanism, such as the
traditional one considered here, performs well
in this environment should not be considered
startling. The experimental evidence about the
performance of public good mechanisms is
replete with examples in which a mechanism
with theoretically undesirable properties exhibited
superior performance (see, for example,
Ferejohn et al., 1982; Smith, 1977).

Finally, we have demonstrated that laboratory
experimentation provides a direct means for
evaluating issues in transfer pricing. However,
evaluating the performance of three alternative
mechanisms represents only an initial step. Specifically,
in its most elementary form, a transfer
pricing environment consists of a buyer and seller
who bargain over the price and units to be exchanged
and who are incompletely informed
about each other's valuations and costs, respectively.
Thus, the transfer pricing problem shares
many of the characteristics of a noncooperative
bargaining process in which there is incomplete
information (Cramton, 1984; Sobel & Takahashi,
1983). Using recent advances in bargaining
theory, existing transfer pricing models could be
reformulated to explicitly incorporate incomplete
information. A well-controlled incomplete
information transfer pricing experiment could
then be conducted to explore the relationship
between efficiency and different information
conditions.
APPENDIX


General instructions 

This is an experiment in the economics of decision-making. The instructions are simple and if you follow 
them carefully and make good decisions, you can earn a considerable amount of money which will be paid 
to you in cash at the end of the experiment. 

In this experiment you will be either a buyer or a seller. Your identification number at the top of this page 
tells you whether you are a buyer or a seller. 

Buying and selling will take place over a sequence of trading periods. All the rules described below apply 
equally to each trading period. In the instructions that follow, the value of any decisions you might make 
is described along with specific rules for trading and record-keeping. 


Specific instructions to buyers 
During each trading period you are free to purchase up to ten units from the seller. In your folder you
will find a packet containing the Buyer’s Resale Value and Profit Schedule. The information provided in this
schedule is for your private use. Do not reveal this information to anyone. On the Buyer’s Resale Value and 
Profit Schedule you will find a column labelled “Resale Value.” The entries in this column tell you the 
amount the experimenter will pay you for each unit that you purchase from the seller in a trading period. 
Although you will have a separate schedule for each trading period, your resale values will never change 
from one trading period to the next. 

Consider the sample Resale Value and Profit Schedule below: 


            Resale        Price per     Profit per
Unit        Value ($)     Unit          Unit
(1)         (2)           (3)           (4)=(2)-(3)
1           300
2           200
3           100
Total profits for the period


This schedule tells you that if you purchase 1 unit, your resale value for that unit you purchase is $300. If 
you purchase a second unit, your resale value on that unit is $200. 

During a trading period, you will have to come to an agreement with the seller over the number of units 
you will purchase and the price you will pay for each unit. The specific trading rules which you must use 
to come to an agreement will be described later in these instructions. Once an agreement is reached, you 
should go to column (3) of your Resale Value and Profit Schedule and record the price per unit that you 
have agreed to pay for units on each row corresponding to each unit you have agreed to purchase. Your 
profit on each unit you have purchased can then be determined by subtracting the price per unit you have 
agreed to pay from your resale value for each unit you have agreed to purchase. This is done by subtracting 
the entries in column (3) from the amounts given in column (2) for each unit you have purchased, and by 
recording this difference in column (4). Your total profit for the trading period is determined by adding up 
your profit per unit on each unit you have purchased and recording this amount on the last row of your 
Resale Value and Profit Schedule. 

Refer once again to your sample Resale Value and Profit Schedule above. Suppose you agree to buy 2 units
from the seller for $100 each. You should first record this price on the first two rows of column (3)
corresponding to the two units you have purchased. Your profit on the first unit you have purchased is (300 —
100 =) $200 and your profit on the second unit you have purchased is (200 — 100 =) $100. These amounts
should be recorded on the first two rows of column (4). To determine your profit for the trading period,
add these amounts to obtain (200 + 100 =) $300, and record this in the space provided at the bottom of
the schedule. When correctly filled out, your sample Resale Value and Profit Schedule should look like the
following:
            Resale        Price per     Profit per
Unit        Value ($)     Unit ($)      Unit ($)
(1)         (2)           (3)           (4)=(2)-(3)
1           300           100           200
2           200           100           100
3           100
Total profits for the period            300


Obviously these figures are illustrative only and should not be assumed to apply in the actual experiment. 
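The buyer's record-keeping arithmetic described above can be checked with a short sketch. The function name and its arguments are hypothetical; the numbers are the illustrative ones from the sample schedule.

```python
from typing import List, Sequence, Tuple


def buyer_profit(
    resale_values: Sequence[float], units_bought: int, price_per_unit: float
) -> Tuple[List[float], float]:
    """Column (4) = column (2) - column (3) for each purchased unit, plus the total."""
    per_unit = [value - price_per_unit for value in resale_values[:units_bought]]
    return per_unit, sum(per_unit)


# Sample schedule from the instructions: buy 2 units at $100 each.
per_unit, total = buyer_profit([300, 200, 100], units_bought=2, price_per_unit=100)
print(per_unit, total)
```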

At the end of each trading period, record your total profit for the period on your profit sheet provided 
in your folder. At the end of the experiment, add up your total profit from each trading period. The experi- 
menter will pay you this amount in cash. 


Specific instructions to sellers 

During each trading period you are free to sell up to ten units to the buyer. However, the units that you 
sell are not free. You must pay a production cost for units which you sell. In your folder you will find a
packet containing the Seller's Production Cost and Profit Schedule. The information provided in this 
schedule is for your private use. Do not reveal this information to anyone. On the Seller’s Production Cost 
and Profit Schedule you will find a column labelled “Production Cost.” The entries in this column tell you 
the production cost you must pay for each unit you will sell in a trading period. Although you will have a 
separate schedule for each trading period, your production costs will never change from one trading period 
to the next. 

Consider the sample Production Cost and Profit Schedule below: 


            Price per     Production    Profit per
Unit        Unit          Cost ($)      Unit
(1)         (2)           (3)           (4)=(2)-(3)
1                         30
2                         80
3                         150
Total profits for the period


This schedule tells you that if you sell 1 unit, your cost for that unit you sell is $30. If you sell a second unit,
the cost of selling that unit is $80.

During a trading period, you will have to come to an agreement with the buyer over the number of units
you will sell and the price you will receive for each unit. The specific trading rules which you must use to
come to an agreement will be described later in these instructions. Once an agreement is reached, you
should go to column (2) of your Production Cost and Profit Schedule and record the price per unit you have
agreed to sell. Your profit on each unit you have sold can then be determined by subtracting your production
cost for that unit you have agreed to sell from the price per unit you have agreed to receive. This is done
by subtracting the amounts given in column (3) from the entries in column (2) for each unit you have sold,
and by recording this difference in column (4). Your total profit for the trading period is determined by
adding up your profit per unit for each unit you have sold and recording this amount on the last row of your
Production Cost and Profit Schedule.

Refer once again to your sample Production Cost and Profit Schedule above. Suppose you agree to sell 
two units to the buyer for $100 each. You should first record this price on the first two rows of column (2) 
corresponding to the two units you have sold. Your profit on the first unit you have sold is (100 — 30 =) 
$70 and your profit on the second unit you have sold is (100 — 80 =) $20. These amounts should be 
recorded on the first two rows of column (4). To determine your profit for the trading period, add these 
amounts to obtain (70 + 20 =) $90, and record this in the space provided at the bottom of the schedule. 
When correctly filled out, your sample Production Cost and Profit Schedule should look like the following: 
            Price per     Production    Profit per
Unit        Unit ($)      Cost ($)      Unit ($)
(1)         (2)           (3)           (4)=(2)-(3)
1           100           30            70
2           100           80            20
3                         150
Total profits for the period            90


Obviously these figures are illustrative only and should not be assumed to apply in the actual experiment. 
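The seller's calculation mirrors the buyer's, with production cost subtracted from the agreed price. A minimal sketch, using a hypothetical function name and the illustrative figures from the sample schedule:

```python
from typing import List, Sequence, Tuple


def seller_profit(
    production_costs: Sequence[float], units_sold: int, price_per_unit: float
) -> Tuple[List[float], float]:
    """Column (4) = column (2) - column (3) for each unit sold, plus the total."""
    per_unit = [price_per_unit - cost for cost in production_costs[:units_sold]]
    return per_unit, sum(per_unit)


# Sample schedule from the instructions: sell 2 units at $100 each.
seller_per_unit, seller_total = seller_profit([30, 80, 150], units_sold=2, price_per_unit=100)
print(seller_per_unit, seller_total)
```

Selling the third unit at the same price would yield a per-unit profit of 100 - 150 = -50, which is why the sample agreement stops at two units.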

At the end of each period, record your total profit for the period on your profit sheet provided in your
folder. At the end of the experiment, add up your total profit from each trading period. The experimenter
will pay you this amount in cash.


Trading rules — direct negotiation 

During each trading period, you are free to send messages to the other party with whom you are negotiating.
In your folder you will find an ample supply of Offer Forms. If you are a buyer, you should write in
the price per unit you are willing to pay for each given number of units on this form. For example, if you
write in 50 next to 3 units, this means you are willing to buy 3 units at a price per unit of $50 for a total of
(3 X 50 =) $150. You should write in all your purchase offers for ten units. If you are a seller, you should
write in the price per unit you require to sell each given number of units on this form. For example, if you
write in 100 next to 6 units, this means you are willing to sell 6 units at a price per unit of $100 for a total
of (6 X 100 =) $600. You should write in all your sales offers for 10 units. When you finish recording these
amounts, hand this form to the experimenter and he will deliver it to the other party with whom you are
negotiating.

If you wish to accept an offer that has been made to you by the other party, circle the number of units 
you wish to transact. If you are a buyer and you accept a seller's offer, you must pay the seller the price per 
unit that the seller has recorded next to the number of units you have decided to purchase. If you are a seller
and you accept a buyer’s offer, the buyer will pay you the price per unit that the buyer has recorded next
to the number of units you have decided to sell. When an offer has been accepted, an agreement has been
reached and the trading period is over. You should then compute your total profit for that trading period 
and record it on your profit sheet. 


Trading rules — traditional 

During each trading period, you should send your offers to the experimenter. In your folder you will find 
an ample supply of Offer Schedule Forms. If you are a buyer, you should write in the price per unit you are 
willing to pay for each given number of units on this form. For example, if you write in 50 next to 3 units, 
this means you are willing to buy 3 units at a price per unit of $50 for a total of (3 X 50 =) $150. You should 
write in all your purchase offers for ten units. If you are a seller, you should write in the price per unit you 
require to sell each given number of units on this form. For example, if you write in 100 next to 6 units, this
means you are willing to sell 6 units at a price per unit of $100 for a total of (6 X 100 =) $600. You should 
write in all your sales offers for 10 units. When you finish recording these amounts, hand this form to the 
experimenter. After both the buyer and the seller have submitted their forms, the experimenter will return 
your offer form to you with a suggested number of units to be transacted and the suggested price per unit 
you will pay if you are a buyer or the suggested price per unit you will receive if you are a seller. 

The experimenter will suggest a number of units to be transacted and a price per unit that you will pay 
or receive for these units as follows: 

(1) The suggested number of units to be transacted is the largest number of units for which the price the 
seller requires to sell these units is not greater than the price the buyer is willing to pay for these units. 

(2) The suggested price the buyer is asked to pay to the seller for these units will be determined by 
averaging the price the seller requires to sell these units and the price the buyer is willing to pay to purchase 
these units. The suggested selling price that is offered to the seller by the experimenter is the same as the 
suggested buying price that is offered to the buyer. 

To see how this will work, suppose the buyer gives the experimenter the following purchase offer
schedule which specifies the price per unit that the buyer is willing to pay for a given number of units. Also 
suppose that the seller gives the experimenter the following sales offer schedule which specifies the price 
per unit that the seller requires to sell a given number of units. 
Number of     Buyer's Purchase        Seller's Sales
Units         Offer Schedule ($)      Offer Schedule ($)
1             600                     500
2             590                     510
3             570                     520
4             560                     530
5             545                     540
6             530                     550
7             510                     560
8             500                     570
9             480                     580
10            460                     590


With this information, the experimenter will suggest that 5 units be transacted since this is the largest
number of units for which the price the seller requires to sell is not greater than the price the buyer is willing
to pay.

The suggested price the buyer will be asked to pay to the seller for these units and the seller will be
offered to sell these units is the average of the price the seller requires to sell five units (i.e. 540) and the
price the buyer is willing to pay for five units (i.e. 545). Thus, the suggested price per unit to both parties
will be (540 + 545)/2 = 542.5. Obviously these figures are illustrative only and should not be assumed to
apply in the actual experiment.
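Rules (1) and (2) of the traditional mechanism, as worded above, can be written out as a short sketch. The function name is hypothetical, and returning `None` when no quantity qualifies is an assumption, since the instructions do not say what happens in that case.

```python
from typing import Optional, Sequence, Tuple


def suggest_traditional(
    buyer_offers: Sequence[float], seller_offers: Sequence[float]
) -> Optional[Tuple[int, float]]:
    """Return (suggested units, suggested price per unit), or None if no quantity qualifies."""
    # Rule (1): largest quantity whose seller price does not exceed the buyer price.
    feasible = [
        units
        for units, (bid, ask) in enumerate(zip(buyer_offers, seller_offers), start=1)
        if ask <= bid
    ]
    if not feasible:
        return None  # Assumption: no suggestion is made.
    units = max(feasible)
    # Rule (2): average of the two prices quoted for that quantity.
    price = (buyer_offers[units - 1] + seller_offers[units - 1]) / 2
    return units, price


# Illustrative schedules from the instructions.
buyer = [600, 590, 570, 560, 545, 530, 510, 500, 480, 460]
seller = [500, 510, 520, 530, 540, 550, 560, 570, 580, 590]

print(suggest_traditional(buyer, seller))  # (5, 542.5)
```

Because the suggested price depends on both parties' quotes for the marginal quantity, each party can shift the price in its own favour by misreporting that entry.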

After the experimenter gives you the suggested number of units to be transacted and the suggested price
per unit you must pay or receive, you must indicate your decision of whether to accept or reject the suggested
number of units and the suggested price per unit. To do this, check the space marked “Accept” or
“Reject” on the form the experimenter gives you, and hand it back to him.

If both parties agree to the suggested number of units to be transacted and the suggested price per unit, 
the trading period is over. You should then compute your profit for that period and record it on your profit 
sheet. If either you or the other party does not agree with the suggested number of units and the suggested 
price per unit, you will be asked to send another schedule of prices to the experimenter and the above pro- 
cedure will be repeated. 


Trading rules — Ronen-McKinney 

During each trading period, you should send your offers to the experimenter. In your folder you will find 
an ample supply of Offer Schedule Forms. If you are a buyer, you should write in the price per unit you are
willing to pay for each given number of units on this form. For example, if you write in 50 next to 3 units,
this means you are willing to buy 3 units at a price per unit of $50 for a total of (3 X 50 =) $150. You should
write in all your purchase offers for ten units. If you are a seller, you should write in the price per unit you
require to sell each given number of units on this form. For example, if you write in 100 next to 6 units, this
means you are willing to sell 6 units at a price per unit of $100 for a total of (6 X 100 =) $600. You should
write in all your sales offers for ten units. When you finish recording these amounts, hand this form to the 
experimenter. After both the buyer and the seller have submitted their forms, the experimenter will return 
your offer form to you with a suggested number of units to be transacted and the suggested price per unit 
you will receive if you are a seller. 

The experimenter will suggest a number of units to be transacted and a price per unit that you will pay
or receive for these units as follows:

(1) The suggested number of units to be transacted is the largest number of units for which the price the 
seller requires to sell these units is not greater than the price the buyer is willing to pay for these units. 

(2) The suggested price the buyer is asked to pay to the seller for these units will be determined by 
averaging all of the prices that the seller requires to sell all units up to and including the suggested number 
of units. 

(3) The suggested price the seller will be offered to sell these units will be determined by averaging all
of the prices that the buyer is willing to pay for all units up to and including the suggested number of units.

To see how this will work, suppose the buyer gives the experimenter the following purchase offer 
schedule which specifies the price per unit that the buyer is willing to pay for each given number of units. 
Also suppose that the seller gives the experimenter the following sales offer schedule which specifies the 
price per unit that the seller requires to sell each given number of units. 
Number of     Buyer's Purchase        Seller's Sales
Units         Offer Schedule ($)      Offer Schedule ($)
1             600                     500
2             590                     510
3             570                     520
4             560                     530
5             545                     540
6             530                     550
7             510                     560
8             500                     570
9             480                     580
10            460                     590


With this information, the experimenter will suggest that 5 units be transacted since this is the largest
number of units for which the price the seller requires to sell is not greater than the price the buyer is willing
to pay.

The suggested price the buyer will be asked to pay for these five units is the average of the prices the seller
requires to sell 1, 2, 3, 4, and 5 units. That is, the buyer will be asked to pay a price of (500 + 510 + 520 +
530 + 540)/5 = 520 per unit. Note that the price the buyer is asked to pay does not depend on the purchase
offer schedule which the buyer has submitted.

The suggested price the seller will be offered for selling these five units is the average of the prices the
buyer is willing to pay for 1, 2, 3, 4 and 5 units. That is, the seller will be offered a price of (600 + 590 +
570 + 560 + 545)/5 = 573 per unit. Note that the price the seller is offered does not depend on the sales
offer schedule which the seller has submitted. Obviously these figures are illustrative only and should not
be assumed to apply in the actual experiment.
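Rules (1) to (3) of the Ronen-McKinney mechanism, as stated in these instructions, can be sketched the same way. The function name is hypothetical, and returning `None` when no quantity qualifies is again an assumption not covered by the instructions.

```python
from typing import Optional, Sequence, Tuple


def suggest_ronen_mckinney(
    buyer_offers: Sequence[float], seller_offers: Sequence[float]
) -> Optional[Tuple[int, float, float]]:
    """Return (units, price the buyer pays, price the seller receives), or None."""
    # Rule (1): largest quantity whose seller price does not exceed the buyer price.
    feasible = [
        units
        for units, (bid, ask) in enumerate(zip(buyer_offers, seller_offers), start=1)
        if ask <= bid
    ]
    if not feasible:
        return None  # Assumption: no suggestion is made.
    units = max(feasible)
    # Rule (2): the buyer pays the average of the seller's prices for units 1..units.
    buyer_pays = sum(seller_offers[:units]) / units
    # Rule (3): the seller receives the average of the buyer's prices for units 1..units.
    seller_receives = sum(buyer_offers[:units]) / units
    return units, buyer_pays, seller_receives


# Illustrative schedules from the instructions.
buyer = [600, 590, 570, 560, 545, 530, 510, 500, 480, 460]
seller = [500, 510, 520, 530, 540, 550, 560, 570, 580, 590]

print(suggest_ronen_mckinney(buyer, seller))  # (5, 520.0, 573.0)
```

Each party's price is computed only from the other party's schedule, so a party's own report affects its price only through the suggested quantity.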

After the experimenter gives you the suggested number of units to be transacted and the suggested price 
per unit you must pay or receive, you must indicate your decision of whether to accept or reject the 
suggested number of units and the suggested price per unit. To do this, check the space marked “Accept” 
or “Reject” on the form the experimenter gives you, and hand it back to him. 

If both parties agree to the suggested number of units to be transacted and the suggested price per unit, 
the trading period is over. You should then compute your profit for that period and record it on your profit 
sheet. If either you or the other party does not agree with the suggested number of units and the suggested 
price per unit, you will be asked to send another schedule of prices to the experimenter and the above pro- 
cedure will be repeated. 


Final remarks 
Each trading period will last for 10 minutes. However, the time spent by the experimenter will not be 
counted. If you fail to come to an agreement in that time, your total profit for that period will be zero. 
There will be many trading periods and there are likely to be many cases of disagreement, but you are 
free to keep trying, and as a buyer or a seller you are free to make as much profit as you can. Are there any 
questions? 


Practice calculations for buyer (seller) 








1. Does the seller (buyer) know your resale value (production cost) schedule? Yes No
2. Suppose you have the following Resale Value and Profit (Production Cost) Schedule:

            Resale        Price per     Production Cost     Profit per
Unit        Value ($)     Unit ($)      per Unit ($)        Unit ($)
(1)         (2)           (3)           (3a)                (4)=(2)-(3) [(4)=(3)-(3a)]
1           500                         200
2           450                         250
3           400                         300
4           350                         350
5           300                         400
6           250                         450
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If you can purchase (sell) units at a price of 375, what is the largest number of units you can purchase 
without incurring a negative profit or any unit purchased (sold? units, ` 
3. Suppose the experimenter receives the following schedule from the seller and schedule (1) from you: 

          Seller’s     Buyer’s         Buyer’s 
Units     Schedule     Schedule (1)    Schedule (2) 
1         100          360             350 
2         150          300             280 
3         200          240             210 
4         250          185             150 
5         300          160             100 
6         350          135              70 
7         400          110              50 
8         450           85              35 
9         500           60              25 
10        550           35              20 

How many units will the experimenter suggest that you should purchase from the seller? ______ units. 
What suggested price per unit will the experimenter propose to you? $______ per unit. 
What suggested price per unit will the experimenter propose to the seller? $______ per unit. 

4. Suppose instead that you submitted schedule (2). Answer the same questions. 

3a. (To sellers) Suppose the experimenter received the following schedule from the buyer and schedule 
(1) from you: 

          Seller’s        Seller’s        Buyer’s 
Units     Schedule (1)    Schedule (2)    Schedule 
1         100             170             360 
2         150             200             300 
3         200             230             240 
4         250             280             185 
5         300             340             160 
6         350             410             135 
7         400             490             110 
8         450             580              85 
9         500             680              60 
10        550             790              35 

How many units will the experimenter suggest that you should sell to the buyer? ______ units. 
What suggested price per unit will the experimenter propose to you? $______ per unit. 
What suggested price per unit will the experimenter propose to the buyer? $______ per unit. 

4a. (To sellers) Suppose instead that you submitted schedule (2). Answer the same questions. 

5. Compare your answers to question 3 (3a) and question 4 (4a) and circle the correct answer in the 
following statements. 

The number of units I can purchase (sell) is higher, the same, lower in question 3 than in question 4. 
The price per unit I have to pay (will receive) is higher, the same, lower in question 3 than in question 4. 
The price per unit the seller (buyer) will receive (pay) is higher, the same, lower in question 3 than in 
question 4. 
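
The profit columns in practice question 2 follow the rules given in the column headings (buyer: resale value minus price; seller: price minus production cost). A minimal sketch of that arithmetic is given here, assuming the first data column of the schedule is the resale value and the second is the production cost; the function names are hypothetical.

```python
resale_values = [500, 450, 400, 350, 300, 250]     # buyer's schedule, units 1-6
production_costs = [200, 250, 300, 350, 400, 450]  # seller's schedule, units 1-6


def buyer_profits(price):
    """Profit per unit for the buyer: column (2) minus column (3)."""
    return [value - price for value in resale_values]


def seller_profits(price):
    """Profit per unit for the seller: column (3) minus column (3a)."""
    return [price - cost for cost in production_costs]


def max_units_without_loss(profits):
    """Count units from the first one until a unit would earn a negative profit."""
    count = 0
    for profit in profits:
        if profit < 0:
            break
        count += 1
    return count


print(max_units_without_loss(buyer_profits(375)))   # 3
print(max_units_without_loss(seller_profits(375)))  # 4
```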
Abstract 


For a bankruptcy prediction problem, the judgment formation process is studied using linear models and 
process tracing models. The linear models are constructed using traditional linear discriminant analysis 
techniques. The process tracing models are constructed using computer-generated algorithmically-based 
decision nets. All the models presented show good predictive accuracy. However, the linear models and 
process tracing models diverge widely on several measures of cue importance. This divergence, for a fairly 
straightforward problem, is intriguing since virtually all the evidence in the accounting literature about cue 
importance is based on linear models research. The importance of different information cues to decision- 
makers is clearly a critical issue in the design of effective accounting information systems. Thus, this study 
suggests the need for much more careful attention to the complex question of assessing cue importance. 


So-called linear models of the judgment forma- 
tion process have been used frequently in study- 
ing the bankruptcy prediction problem (see 
Casey & Selling, 1986, and Ashton, 1984, for 
selected references). Thirty-two separate linear 
model studies of accounting-related tasks are 
also described by Libby (1983). Process tracing 
models of judgment formation are widely cited 
in accounting journals. A process tracing model 
is used in a recent paper by Bouwman et al. 
(1987). Benefits from simultaneously applying 
more than one model of the judgment formation 
process have been identified by Einhorn et al. 
(1979) and Larcker & Lessig (1983). In those 
studies, statistically-derived linear models were 
used simultaneously with process tracing mod- 
els (constructed from subjects’ verbalizations of 
their mental processes) to draw inferences 
about cue importance and decision making. 
Einhorn et al. (1979) undertook two experi- 
ments in which the ability of a linear regression 
model (linear model) to predict a subject’s judg- 
ments was compared to that of a model de- 
veloped from the subject’s concurrent verbal 
protocol (process tracing model). The goal of 
their study was to discern a preferable approach 
to judgment modelling, but they concluded in- 
stead that the two models provide essentially the 
same information, albeit in different forms. Both 
models captured such basic elements of judg- 
ment as information search, information combi- 
nation and feedback/learning. The models dif- 
fered primarily in the level of descriptive detail 
regarding these three elements. The authors also 
found that neither model was a superior predic- 
tor of judgments in the two task environments, 
and use of two or more models could provide 
more information about judgments than one 
model alone. Similar results from multiple mod- 
els are also useful in isolating the impact of 
“method variance” (Cook & Campbell, 1979, pp. 
59-70). 

Larcker & Lessig (1983) also found linear 


model predictions to be generally consistent 


with process tracing model predictions in a task 
where subjects made common stock buy/no-buy 


“Dedicated to the memory of Cornelius Casey, a colleague and friend. This paper could not have been written without his 


collaboration on an earlier project (Casey & Selling, 1986). 
decisions on the basis of six financial informa- 
tion cues. Their process tracing models were 
formulated from retrospective verbal introspec- 
tion and they proposed and utilized multiple 
measures of cue importance for application to 
the process models. For those subjects where 
cue importance results were not consistent be- 
tween the linear model and the process model, 


they demonstrated from inspection of the pro- 
cess tracing model that the judgment process 
was relatively complex and was not adequately 
captured by a “simple main effects” linear model. 

Although both of the above studies found both 
types of models to show high levels of predictive 
ability, this does not lessen the need for exten- 
sions of research comparing such models. Al- 
though the previous studies have found linear 
and process tracing to be similar in the attribute 
of predictive ability, this should not be, in our 
opinion, the primary goal of judgment model- 
ling. We hold this view for two reasons. First, the 
predictive ability of linear models is well known 
as being robust with respect to deviations from 
linearity when the cues in the model have (or 
can be rescaled to have) a monotone relation- 
ship with the criterion variable. Under such con- 
ditions, linear models have been shown to be 
good predictors even when the underlying judg- 
ment formation process is known to be non- 
linear (Dawes & Corrigan, 1974). The robust- 
ness is also present when error in the measure- 
ment of the cues tends to make the relationship 
more linear (Dawes & Corrigan, 1974). These 
statistical properties may well bias findings to- 
ward linear models being more highly predic- 
tive of subjects’ judgments than process tracing 
models. There may also exist some as yet unin- 
vestigated properties of the method by which 
process traces are constructed which would 
make their predictive ability similarly robust 
over broad classes of judgment settings. 

A second and even more important reason for 
not emphasizing predictive ability in problems 
involving the use of financial analysis is that it 
would be rare for a static model based on limited 
cue information (such as those used in virtually 
all the extant accounting literature) to ever be 
substituted for human expert judgment. The 


usefulness of the models in practical situations 
lies much more in helping to identify important 
cues for the design of information systems and/ 
or expert systems. Models should exhibit good 
predictive ability of subjects’ judgments before 
they can be used to assess cue importance, but 
evaluating models of judgment based on their 
comparative predictive accuracy would be inad- 
visable. The cue utilization inferences have 
much greater potential importance. 

Methodological constraints of the two previ- 
ous studies limited their ability to infer differ- 
ences in cue importance measures. For Einhorn 
et al. (1979), the huge amount of data to be 
closely scrutinized from concurrent verbal pro- 
tocols evidently constrained the sample size to 
only one subject for each of the two experi- 
ments. For accounting studies utilizing concur- 
rent verbal protocol analysis, the sample size is 
typically fewer than ten (Biggs, 1979; Bhaskar & 
Dillard, 1979; Campbell, 1984). Larcker & Lessig 
(1983) reduced the data collection task by 
employing retrospective elicitation of the 
thought process. This increased somewhat the 
practical sample size. However, such “retrospec- 
tive introspection” may only be a reliable reflec- 
tion of cognitive processing in a very restricted 
class of experimental settings (Nisbett & Wilson, 
1979; Ericsson & Simon, 1980). The data analy- 
sis limitations of concurrent verbal protocols 
and the potential limitations of retrospective 
elicitation are overcome in the present study by 
introducing an automated method of generating 
and analyzing concurrent process traces. 

The current study uses linear models and pro- 
cess tracing models simultaneously in a bank- 
ruptcy prediction context. The purpose of the 
study is threefold: using a familiar accounting in- 
formation problem, (1) to demonstrate that it is 
practical, using computer-based modelling, to 
generate large sample sizes for the process trac- 
ing view of the judgment formation process, (2) 
to compare the insights about cue importance 
derived from linear models and process tracing 
models, and (3) based on the results of (2), to 
urge a broad re-evaluation of the evidence about 
cue importance in accounting contexts. We be- 
lieve that assessing the importance to decision- 
makers of different accounting cues is still a very 
important subject. Up to this time, virtually all 
the evidence is from linear model studies. 


THE STUDY 


Task 

A recent study (Casey & Selling, 1986) inves- 
tigated the effect of task predictability and the 
disclosure of prior probability information on 
subjects’ ability to accurately predict bank- 
ruptcy as well as their ability to assess the proba- 
bility of their judgment being correct. The cur- 
rent study is based on additional data collected 
from the same laboratory experiment. 

In the study, 71 participants were divided ran- 
domly into six approximately equal treatment 
groups (three levels of prior probability disclo- 
sure crossed with two levels of task predictabil- 
ity). Each subject made dichotomous predic- 
tions of bankrupt/non-bankrupt for each of 30 
(15 bankrupt and 15 non-bankrupt) disguised, 
actual companies. The experiment took place in 
a personal computer laboratory using a compu- 
ter program developed specifically for the ex- 
periment. Subjects were instructed to select as 
few as one or as many as seven financial ratios in 
any sequence they wished for each firm and then 
to make their prediction. The order of the cues 
chosen and the prediction constitute the data to 
be analyzed for each subject in this study. This 
method of data collection closely resembles the 
information board technique first used by Payne 
(1976). The main difference here is that all as- 
pects of the experimental task were automated. 
This greatly reduces the level of experimenter 
involvement, which makes larger sample sizes 
possible. In a following section we describe a 
new method for analyzing the data. 


Subjects 


Subjects were second-year MBA students who 
volunteered to participate in the study. Partici- 
pation was motivated in part by the announce- 
ment that prizes would be awarded based on 
task performance. There was no experimental 
attrition of subjects and all subjects requested 
that they be sent a detailed report of the study’s 
findings upon its completion. The average sub- 
ject had 1.03 years of accounting and financial 
analysis. The current study used 69 of the 71 
subjects because a data handling error caused 
two records to be erased. 


Models of the judgment formation process 


A traditional linear model. Multiple discri- 
minant analysis was used to develop two linear 
models of judgment for each subject. The first 
model was based on actual values for the five 
cues most frequently referenced by the subject. 
It was not possible to use all seven cues because 
of problems with the mathematics of discrimin- 
ant analysis. Cooley & Lohnes (1977, pp. 243- 
250) point out that unless the number of obser- 
vations in each treatment group,¹ minus one, ex- 
ceeds the number of variables, the discriminant 
function cannot be estimated. Further, as the 
number of observations in each treatment group 
increases in relation to the number of variables 
in the model, so does the reliability of the 
parameter estimates. Using the five cues most 
frequently referenced by the subject, we were 
able to construct discriminant functions for 63 
of the 69 subjects. Fewer than 1% of the judg- 
ments made by these 63 subjects used more than 
five cues. We term this approach a “traditional” 
linear model. 
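
The estimation constraint described above (observations in each judgment group, minus one, must exceed the number of variables) can be expressed as a small check. This is a sketch only; the function names and the example split of judgments are hypothetical.

```python
def min_observations_per_group(n_variables):
    """Smallest group size n satisfying n - 1 > n_variables."""
    return n_variables + 2


def can_estimate(judgments, n_variables):
    """True if both judgment groups are large enough to estimate the function."""
    n_fail = judgments.count("fail")
    n_survive = judgments.count("survive")
    return min(n_fail, n_survive) - 1 > n_variables


# A hypothetical subject who judged 8 of the 30 firms as failures.
judgments = ["fail"] * 8 + ["survive"] * 22
print(min_observations_per_group(7))  # 9
print(min_observations_per_group(5))  # 7
print(can_estimate(judgments, 7))     # False
print(can_estimate(judgments, 5))     # True
```

A subject with a lopsided split of judgments can therefore support a five-cue model but not a seven-cue one, which is consistent with the choice of five cues here.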


A modified linear model. One of the criti- 
cisms of linear models as representations of cog- 
nitive processing is the presumption that all cues 
are attended to by each subject on each trial. 
Deviations from this assumption are frequently 
observed in laboratory experiments, including 


¹The term treatment group refers here to the judgments made. In the current study there are 30 trials for which there are 
two judgments, “fail” or “survive”. Thus, for each subject, there are two treatment groups which can range in size from 0 to 
30, depending on the ratio of “fail” to “survive” judgments. To estimate a model with seven variables requires at least nine 
observations for each of the two treatment groups. A model with five variables requires only seven observations in each 
group. 


this one. All cue values for a given trial are used 
in the traditional approach regardless of 
whether all cues were actually referenced by the 
subject or not. Since we knew exactly which 
cues were attended to by each subject on each 
trial, it was possible to develop a different ver- 
sion of the linear model based on a different pre- 
sumption. Specifically, if a subject did not attend 
to a cue on a particular trial the actual value for 
that cue was replaced by an average value. This 
presumption is based on Bouwman’s (1984) 
hypothesis that financial analysts have stored in 
memory a prototypical firm which is used as an 
anchor in analyzing the financial data of actual 
firms. The second linear discriminant function 
we created for each subject used, for any cues 
not referenced in a trial, the average value for the 
cue over the trials in which it was referenced. 
This second model can be viewed as a refine- 
ment capturing one interpretation of the con- 
cept of the unobservable prototype. Our object- 
ive here is to increase the chances of finding con- 
vergence between linear and process tracing 
models on the cue importance dimension. If the 
reader feels the conventional use of actual values 
for unattended to cues is preferable, the revised 
linear model can be disregarded. 
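
The substitution used in the modified linear model can be sketched as follows. The data layout (one row of cue values per trial plus a matching row of referenced flags) and the function name are assumptions made for illustration, not a description of the original program.

```python
def modified_cue_values(values, referenced):
    """Replace each unreferenced cue value by that cue's mean over the
    trials in which the subject did reference it."""
    n_cues = len(values[0])
    means = []
    for j in range(n_cues):
        seen = [row[j] for row, ref in zip(values, referenced) if ref[j]]
        # A cue never referenced has no mean; None marks that case.
        means.append(sum(seen) / len(seen) if seen else None)
    return [
        [v if r else means[j] for j, (v, r) in enumerate(zip(row, ref))]
        for row, ref in zip(values, referenced)
    ]


values = [[1.0, 10.0], [3.0, 50.0], [5.0, 30.0]]           # actual cue values
referenced = [[True, True], [True, False], [False, True]]  # cues the subject looked at
print(modified_cue_values(values, referenced))
# [[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]]
```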


Process tracing models. In Larcker & Lessig 
(1983) and Swinth et al. (1975) the process 
tracing models which are developed for each 
subject take the form of decision nets. In those 
studies the data inputs to the models were retro- 
spective verbalizations. In the current study, a 
process tracing model was constructed for each 
subject using as data inputs the concurrent cue 
usage patterns recorded by the computer. A 
second difference here is that we generated the 
process tracings using a computerized algo- 
rithm, following Newell & Simon (1972). Earlier 
studies used manual scoring rules to con- 
struct the process tracings (Einhorn et al., 1979; 
Biggs, 1979). The computerized algorithm 
employed in this study operates completely 
without experimenter intervention. In contrast 
to the previous process tracing studies cited, all 
subjective judgment in constructing the process 
tracings is completely specified by the algorithm 


which generates them. In this respect, the pro- 
cess tracings here all have the property of being 
totally consistent with respect to the subjective 
judgments on which they are ostensibly based. 

The decision net we created for each subject 
takes the form of a binary tree — a series of 
nodes each branching to two subsequent nodes. 
Nodes in a net represent one of the seven infor- 
mation cues or a judgment [fail (F) or survive 
(S)]. 

An intuitive description of the algorithm is as 
follows: 

(1) The subject has a favorite cue which is typ- 
ically referenced first. 

(2) Based on the observed value of this cue, 
the subject will next reference a second cue. 
Which cue is referenced second depends on the 
value of the first cue. 

(3) Based on the observed value of this sec- 
ond cue, the subject will next reference a third 
cue. Which cue is referenced third depends on 
the value of the second cue. 

(4) At some point in this process, the subject 
truncates the search and makes a judgment (suc- 
cessful firm or failed firm). 

A more formal description of the algorithm is as 
follows: 

(1) The node at the top of the decision net was 
selected as the information cue the subject most 
frequently referenced first in the 30 trials. 

(2) This node branches either to the informa- 
tion cue referenced most frequently after the 
first cue or to a judgment. 

(3) The branching continues until both nodes 
represent judgments, which indicates no further 
information processing. 

(4) The two branch nodes at each stage of the 
process are the two most frequent responses fol- 
lowing the prior response. 

(5) Branching beneath an information cue re- 
quires a rule as to whether to branch left or right 
and a positioning rule for which branch node 
goes on the left and the right. 

(6) The branching rule is simple — branch left 
if the cue value is below a pre-determined criti- 
cal value, otherwise branch right. This rule re- 
quires the calculation of the pre-determined 
critical value for the cue. 


TWO APPROACHES TO JUDGMENT MODELLING 69 


(7) The critical value is the simple average of 
two scores, one for each of the two branch 
nodes. 

(8) This score for each of the two branch 
nodes is just the simple average of the prior node 
cue value over the trials when that cue precedes 
the branch cue. 

(9) The positioning rule for the two branches 
is also simple — the one with the lower score 
goes on the left. 

(10) Since all seven of the information cues 
are scaled with lower values indicating poorer 
financial health, right branches indicate a suc- 
cess judgment or further processing; left 
branches indicate a failed judgment or further 
processing. 

Since the algorithm is complicated, we will 
recapitulate the steps here in a manner which 
resembles the program operation: 

(1) Choose a first cue based on usage fre- 
quency. 

(2) Then choose as the two branch nodes the 
two most frequent responses that follow this 
cue. 

(3) For each of these two branches, calculate 
the average value of the prior cue over the trials 
when it does in fact precede that branch cue. 

(4) Average these two average values to deter- 
mine the critical value for the prior cue. 

(5) Position the two branch nodes so that the 
one with the lower score is on the left. 

(6) If the value of the prior cue is below the 
critical value, branch left, otherwise, branch 
right. 

(7) For any branch node which is an informa- 
tion cue, return to step 2 and iterate through 
step 6. 

(8) Terminate processing when both branch 
nodes are judgments. 
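
The eight steps above can be sketched in code. This is an illustrative reading of the description, not the original program: the trial layout, the cue names (ROA, DEBT), the rule that a cue is never revisited on the same path, and the fallback to the most common judgment are all assumptions introduced here so that the recursion always terminates.

```python
from collections import Counter

JUDGMENTS = ("F", "S")  # fail, survive


def build_node(trials, cue, path=()):
    """Build the net below `cue`; a leaf is the string 'F' or 'S'."""
    path = path + (cue,)
    followers = {}  # response -> values of `cue` on trials where it follows
    for cues, judgment in trials:
        names = [name for name, _ in cues]
        if cue not in names:
            continue
        i = names.index(cue)
        response = names[i + 1] if i + 1 < len(names) else judgment
        if response in path:  # assumption: never revisit a cue on one path
            continue
        followers.setdefault(response, []).append(cues[i][1])
    if not followers:  # assumption: fall back to the most common judgment
        return Counter(j for _, j in trials).most_common(1)[0][0]
    # Step 2: the two most frequent responses following this cue.
    top_two = sorted(followers, key=lambda r: len(followers[r]), reverse=True)[:2]
    # Steps 3 and 5: score each branch; the lower score goes on the left.
    scored = sorted((sum(followers[r]) / len(followers[r]), r) for r in top_two)
    # Step 4: the critical value is the average of the branch scores.
    critical = sum(score for score, _ in scored) / len(scored)
    kids = [r if r in JUDGMENTS else build_node(trials, r, path) for _, r in scored]
    return {"cue": cue, "critical": critical, "left": kids[0], "right": kids[-1]}


def build_decision_net(trials):
    """Step 1: start from the cue most frequently referenced first."""
    first = Counter(cues[0][0] for cues, _ in trials)
    return build_node(trials, first.most_common(1)[0][0])


def predict(net, cue_values):
    """Step 6: branch left below the critical value, otherwise right."""
    node = net
    while isinstance(node, dict):
        below = cue_values[node["cue"]] < node["critical"]
        node = node["left"] if below else node["right"]
    return node


# Each trial: ordered (cue, value) pairs the subject referenced, then the judgment.
trials = [
    ([("ROA", 0.02), ("DEBT", 0.30)], "F"),
    ([("ROA", 0.03), ("DEBT", 0.35)], "F"),
    ([("ROA", 0.10)], "S"),
    ([("ROA", 0.12)], "S"),
    ([("ROA", 0.04), ("DEBT", 0.80)], "S"),
]
net = build_decision_net(trials)
print(predict(net, {"ROA": 0.02, "DEBT": 0.30}))  # F
print(predict(net, {"ROA": 0.12, "DEBT": 0.90}))  # S
```

In this sketch the responses following a cue are counted over all trials containing that cue, regardless of what preceded it; conditioning on the full path would be an alternative reading of the description.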

Although the choice of a binary structure for 
the decision nets is in one sense arbitrary, we be- 
lieve it is appropriate for the following reasons: 

(1) The notion of decision trees has been 
around a long time, and is surely taught in every 
MBA program which offers a course in decision 
analysis. Given this paradigm, the binary net is as 
simple a version as can be constructed, just as an 
additive linear model is the simplest of compen- 
satory non-contingent models which can be 
constructed. 

(2) Our results show that application of the 
model predicts well, so we have no ex post rea- 
son to reject it as a reasonable paradigm. 

(3) We suspect that most nodes of the Larcker 
& Lessig decision nets can be reduced to a binary 
structure. The example in their paper, for in- 
stance, can be rewritten to have six binary node 
choices and only one non-binary choice. In addi- 
tion, one weakness of their method of retrospec- 
tive elicitation is that a “demand effect” may 
cause subjects to state a more complex decision 
process than they actually used. This potential 
upward bias on the number of alternatives out of 
a node and the fact that most alternatives were 
actually binary, indicate to us that our use of 
binary nets was justified. 


‘One way of viewing the choice of this ponit 


is that it is a judgment with the advantage of 
guaranteeing consistency but the disadvantage 
of not being flexible. 

Figure 1 is a schematic representation of the 
decision net for one of the subjects. This model 
of the underlying cognitive process is clearly 
much different from the linear model and we be- 
lieve it is equally plausible to the linear model, 


ex ante. 

Fig. 1. A sample decision net generated from the concurrent 
process trace for one subject. 

Cue importance measures 
Cue importance measures from the two linear 
models were calculated for each subject based 
on the ranking of the absolute value of the stan- 
dardized coefficient estimates for the five cues. 
These measures of cue importance depend on 
the absence of a high level of multi-co-linearity 
among the cues across the 30 firms. The two 
relevant correlation matrices (corresponding to 


the two different sets of 30 firms) for the cues 


used in the study yielded 12 significant correla- 
tion coefficients at the 0.05 level out of a total 
maximum possible of 84. Multi-co-linearity was 
also measured as the number of the 14 coeffi- 
cients of determination (seven times the two dif- 
ferent sets of firms) resulting from regressions of 
each of the seven independent variables on the 
remaining six variables. Half of the computed 
coefficients of determination were significant at 
the 0.05 level and the median value of the coeffi- 
cient was 0.41. The range of values of the coeffi- 
cient was 0.10-0.65. Based on this evidence, and 
the robustness of the model for even moderate 
levels of co-linearity, the level of multi-co-linear- 
ity was deemed low enough that it should not 
pose a serious problem in the interpretation of 
the results. 

Nine different measures of cue importance 
from the process tracing model for each subject 
were derived following Larcker & Lessig (1983). 
One of the contributions of that paper was the 
rich discussion of alternative measures of cue 
importance. To summarize from their paper, the 
nine measures are developed from the factorial 
combinations of two factors, each at three levels. 
The first factor is one of three measures of cue 
importance, denoted alphabetically as follows: 
(A) depth of the cue in the net, where the 
deepest cues are considered to be the least im- 
portant; (B) frequency of appearance in the net 
(the percentage of paths in which the cue ap- 
pears); and (C) frequency of appearance as the 
final cue before truncation (the percentage of 
decisions made at this cue). The second factor 
relates to the type of path in the net in which a 
cue is found, denoted numerically as follows: (1) 
a path which leads to a judgment of “fail”; (2) a 
path which leads to a judgment of “succeed”; and 


(3) no distinction between the terminating 
judgments. Our nine possible measures of cue 
importance are labeled A-1, A-2,... and C-3. 
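
The three families of measures can be sketched for a small net. The nested-dictionary layout, the cue names, the use of the shallowest occurrence for depth, and the counting of paths rather than individual decisions are all assumptions made for illustration; Larcker & Lessig's exact definitions may differ.

```python
def paths(node, prefix=()):
    """Yield (cues_on_path, judgment) for every root-to-leaf path."""
    if not isinstance(node, dict):
        yield prefix, node
        return
    for side in ("left", "right"):
        yield from paths(node[side], prefix + (node["cue"],))


def importance(net, cue, judgment=None):
    """Measures A, B and C for `cue`; judgment 'F', 'S' or None (all paths)."""
    selected = [(c, j) for c, j in paths(net) if judgment in (None, j)]
    with_cue = [c for c, _ in selected if cue in c]
    if not with_cue:
        return None
    depth = min(c.index(cue) + 1 for c in with_cue)                    # A
    frequency = len(with_cue) / len(selected)                          # B
    final = sum(c[-1:] == (cue,) for c, _ in selected) / len(selected)  # C
    return {"A": depth, "B": frequency, "C": final}


# Hypothetical two-cue net: leaves are judgments, 'F' (fail) or 'S' (survive).
net = {
    "cue": "ROA", "critical": 0.07,
    "left": {"cue": "DEBT", "critical": 0.56, "left": "F", "right": "S"},
    "right": "S",
}
print(importance(net, "ROA", "F"))  # {'A': 1, 'B': 1.0, 'C': 0.0}
```

Passing "F", "S" or None as the judgment corresponds to the second factor (levels 1, 2 and 3), so nine values per cue can be produced from the one function.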


RESULTS 


Predictive ability 

We will begin our discussion of the results of 
the study with the predictive accuracy issue. 
This is clearly one measure of the usefulness of a 
model of the judgment formation process. How- 
ever, unless the model is intended to be substi- 
tuted for human processing, the more relevant 
measure of usefulness is the insight about the 
cue importance. Designers of accounting infor- 
mation systems rarely intend for the system to 
replace managerial judgment, but they do need 
to know what information is most useful for 
managers. We thus see alternative ways of view- 
ing cue importance as a much bigger issue for 
accounting researchers than predictive accu- 
racy. However, discussion of alternative cue im- 
portance measures obviously presupposes that 
the models generating the measures can suc- 
cessfully represent the variation in judgments by 
the subject across a set of experimental trials. 
Predictive accuracy is thus a necessary condi- 
tion but not a sufficient one. 

As described above, the study uses three mod- 
els of the judgment formation process: (1) a 
“traditional” linear model; (2) a “modified” 
linear model substituting an average value for 
the actual value for omitted cues; and (3) a pro- 
cess tracing model in the form of a decision net 
constructed using a computerized algorithm. 
This section will compare the ability of these 
three models to replicate or predict the subjects’ 
actual judgments. 

Predictive ability for all three models was 
measured as the percentage of times the model 
predicted judgments correctly. For linear mod- 
els, predictions were developed using a jack- 
knifing procedure similar to Larcker & Lessig 
(1983). In this procedure each trial is predicted 
based on a model built using the other 29 trials. 
This procedure eliminates the bias that results if 
the model is used to predict the same trials that 
are used to build it. 
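
The jack-knifing procedure can be sketched as a leave-one-out loop. The study fits a discriminant function on the held-in trials; the sketch below swaps in a much simpler nearest-group-mean classifier purely to keep the example self-contained, and all names and data are hypothetical.

```python
def fit_group_means(X, y):
    """Stand-in model: the mean cue vector of each judgment group."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict_label(model, row):
    """Assign the judgment whose group mean is closest to the trial."""
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, model[label])),
    )


def jackknife_hit_rate(X, y):
    """Predict each trial from a model built on all the other trials."""
    hits = 0
    for i in range(len(X)):
        model = fit_group_means(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict_label(model, X[i]) == y[i]
    return hits / len(X)


X = [[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2]]  # one cue per trial
y = ["F", "F", "F", "S", "S", "S", "S"]                # subject's judgments
print(round(jackknife_hit_rate(X, y), 3))  # 0.857
```

Because the held-out trial never contributes to the model that predicts it, the resulting hit rate is not inflated by fitting and predicting on the same observations.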


In the case of the process tracing models, we 
chose instead to use the full model for predic- 
tion purposes and accept the bias which results 
when the trial being predicted is itself one of the 
thirty observations used to build the model. This 
choice was based on the significant amounts of 
computer time and programming time required 
for the jack-knifing procedure with the process 
tracing models. 


Table 1 presents the percentage of times each 
model predicted judgments correctly for each of 
the 63 subjects. The table also presents the mean 
predictive ability for each of the three models 
and the corresponding standard deviation. All 
three models predicted significantly better than 
random (alpha less than 0.01). As expected, 
because of the mathematical properties of the 
underlying data, both linear models were sig- 
nificantly better predictors than the process 
tracing (alpha less than 0.001). The modified 
linear model with averages substituted for actual 
values predicted significantly better than the 
“traditional” linear model (alpha less than 
0.001). 


Since the experimental task was constructed 
to cross two levels of information content with 
three levels of prior probability disclosure, an 
ANOVA was performed for each of the three 
judgment models to investigate whether the 
treatment factors systematically affected predic- 
tive ability of the models. The results were nega- 
tive with one exception. The only significant 
main or interaction effect was information con- 
tent for the “traditional” linear model (alpha less 
than 0.02). Scheffe’s post hoc test for this main 
effect indicated that the model predicted sig- 
nificantly better at the high information content 
level. 


Our general conclusion here is that all three of 
the models are sufficiently good predictors of 
the subjects’ judgments, in terms of “hit rates” 
observed in prior studies, so that we can move 
on to the more interesting question to us — 
comparative measures of cue importance. 


Cue importance 

Our next step was to see whether cue impor- 
tance measures generated from the process trac- 
ing models were correlated with their counter- 
parts from the linear models. We chose to focus 
on rank-order correlation across the five cues to 
avoid parametric assumptions about the under- 
lying distributions. We would expect a high 
degree of intercorrelation across the various 
models for a given subject. That is, ex ante, we 
see no reason to believe a highly ranked cue 
from a linear model would not also be highly 
ranked in at least some of the cue importance 
measures from a process tracing model. 
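
A rank-order comparison across five cues can be sketched as below. The importance scores are hypothetical, and the handling of ties by average ranks is an assumption; the paper does not state how ties were treated.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical importance scores for five cues from the two kinds of model.
linear_abs_coef = [0.9, 0.7, 0.4, 0.3, 0.1]     # |standardized coefficient|
path_frequency = [1.0, 0.5, 0.75, 0.25, 0.25]   # a frequency-type net measure
print(round(spearman(linear_abs_coef, path_frequency), 3))  # 0.872
```

With only five cues per subject, each individual coefficient rests on very few observations, which is one reason to look at counts and averages across subjects as well.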

Spearman rank-order correlation coefficients 
were used. For each of 63 subjects and for each 
of the two linear models, nine Spearman rank- 
correlation coefficients were calculated. Each of 
the coefficients compares the ranking of the five 
information cues under one of the measures of 
cue importance from the process tracing model 
vs the ranking of cue importance from the linear 
model. The result here is nine rank-order corre- 
lation coefficients (across the five information 
cues) for each subject for the process tracing 
model vs each linear model. Since two different 
linear models were used, there are two sets of 
nine rank-order correlation coefficients for each 
subject. Table 2 presents, first, the average for 
each of the nine correlation coefficients over the 
63 subjects for both of the linear models. 
Second, the table presents a count of the number 
of correlation coefficients (out of 63 possible) 
which were significantly different from zero at 
the one-tailed probability of 0.05. Third, the 
table presents a count of the number of correla- 
tion coefficients with a sign which agrees with 
the one-directional test that the individual cor- 
relation is significantly different from zero. This 
column of statistics also shows the correspond- 
ing binomial tests as to whether the frequency of 
correspondence in sign of the statistic with the 
direction of the hypothesis is non-random. This 
column is reported in order to give the data as
good a chance as we can think of to provide
evidence for convergence of methods on cue
importance measures. Note that the number of
significant results in the fourth column is equal to
the number of significant average correlation
coefficients in the second column. Furthermore
there is very little variation across the columns
as to specific cue importance measures which
show convergence.


TABLE 2. Rank-order correlations between measures of cue importance

Linear model                  Number        Number signs
coefficient     Average       significant   agreeing with
vs              correlation   (p = 0.05)    the hypothesis   S.D.

Traditional linear model vs the process tracing model
A-1             -0.031         6            39               0.420
A-2             -0.034         6            39               0.415
A-3             -0.033         6            39               0.414
B-1             -0.196*       17            56†              0.607
B-2              0.204*       17            51†              0.570
B-3              0.209*       16            56†              0.600
C-1              0.052         9            43†              0.490
C-2              0.025         7            31               0.440
C-3              0.039         7            39†              0.494

Modified linear model vs the process tracing model
A-1              0.085         2            32               0.418
A-2              0.073         2            33               0.412
A-3              0.074         2            32               0.410
B-1              0.428*       17            42†              0.438
B-2              0.365*       16            43†              0.444
B-3              0.420*       17            44†              0.434
C-1              0.240*       11            32               0.397
C-2             -0.036         2            27               0.474
C-3              0.170*        9            35               0.437

* Significant t-test at the p = 0.05 level.
† Significant binomial test with equal probabilities at the p = 0.05 level.
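
The binomial test on sign agreement reported in the fourth column of Table 2 can be sketched as follows. This is an illustrative calculation only, not a reconstruction of the procedure actually used: it assumes a one-tailed exact binomial test with equal probabilities over the 63 subjects, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(agreeing: int, n: int = 63) -> float:
    """One-tailed exact binomial p-value, P(X >= agreeing), assuming each
    subject's sign is equally likely to agree or disagree with the hypothesis."""
    tail = sum(comb(n, k) for k in range(agreeing, n + 1))
    return tail / 2 ** n


# Counts of agreeing signs of the kind reported in Table 2 (out of 63).
for count in (32, 39, 43, 56):
    print(count, round(sign_test_p(count), 4))
```

A count is flagged when the tail probability falls at or below 0.05; counts close to half of 63 give a tail probability near 0.5, which is what pure chance would produce.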

As with the predictive ability results, it is 
important to test whether the experimental var- 
iables of information content and feedback sys- 
tematically affected the mean Spearman correla- 
tion coefficients. The independent variables are 
the nine measures of cue importance in the deci- 
sion nets across the two types of linear models. 
For this test, 54 F-statistics were computed cor- 
responding to the 18 independent variables 
times the three types of effects (the two main ef- 
fects and their interaction). Only four of the 54 
F-statistics were significant at the 0.05 level. The 
lack of convergence in our sample is thus not in- 
duced by the treatment effects used in the study. 

The null hypothesis of zero average correlation
against the alternative hypothesis of negative
correlation for the three cue depth measures
(A-1, A-2, A-3) cannot be rejected at p = 0.05
for either linear model. A negative correlation
is posited because cue depth is inversely related
to cue importance. The number of individual
correlations which are significantly less
than zero is slightly greater than the number that
would be expected to occur by chance (0.05 ×
63 = 3.15) for the “traditional” linear model but
not for the “modified” linear model.
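
The chance benchmark used above (0.05 × 63 = 3.15) can be made concrete with a short calculation. The sketch below is an illustration under an added assumption, namely that the individual tests are independent, which the study itself does not claim; the function names are hypothetical.

```python
from math import comb


def expected_by_chance(n_tests: int, alpha: float = 0.05) -> float:
    """Number of tests expected to reach significance when every null is true."""
    return n_tests * alpha


def prob_at_least(observed: int, n_tests: int, alpha: float = 0.05) -> float:
    """P(X >= observed) for X ~ Binomial(n_tests, alpha); assumes independent tests."""
    return sum(
        comb(n_tests, k) * alpha ** k * (1 - alpha) ** (n_tests - k)
        for k in range(observed, n_tests + 1)
    )


# 63 subjects, each tested at the 0.05 level.
print(expected_by_chance(63))
# Six significant correlations (traditional model) vs two (modified model).
print(prob_at_least(6, 63), prob_at_least(2, 63))
```

The same benchmark applies to the 54 F-statistics reported earlier: 54 × 0.05 = 2.7 would be expected to reach significance by chance, against four observed.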

For the three measures based on percentage 
of paths containing the cue (B-1, B-2 and B-3), 
the null hypothesis of zero average correlation 
vs the alternate hypothesis of positive average 
correlation can be rejected at the 0.05 level for 
both linear models. However, the evidence here 
is still quite weak when the number of individual 
correlations that are significant at the 0.05 level 
is considered. The maximum observed value 
here is only 17 out of 63. The evidence is also
mixed and quite weak for the three “C” variables
(cue referenced at the decision point). Only two
of the six average correlations are significantly
different from zero at the p = 0.05 level, and the
number of individual correlations significantly
greater than zero ranges only from 2 to 11 out of
63.

Our general conclusion here is that there is 
very little convergence on measures of cue im- 
portance between the process tracing model 
and either of the two linear models. Obviously, 
there are many different ways to assess the im- 
portance of an information cue. We have 
examined here only four measures — three 
taken from process tracing models and one 
taken from linear models. Linear models really 
only permit one measure of cue importance, the 
standardized coefficient value in the discrimin- 
ant function. Process tracing models permit 
many different measures of cue importance, of 
which we have examined three. 
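
For readers unfamiliar with the statistic, the comparison made for each subject can be sketched as a Spearman rank-order correlation between two importance measures defined over the same cues. The code is a minimal illustration, not the analysis code of the study: the seven cue values are invented, and pairing the magnitude of the standardized coefficient with the percentage of paths containing the cue (the “B” type of measure) is chosen only as an example.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for one subject and seven cues.
coefficient_magnitude = [0.62, 0.10, 0.35, 0.48, 0.05, 0.27, 0.81]
pct_paths_with_cue = [40, 90, 55, 20, 75, 60, 35]
rho = spearman(coefficient_magnitude, pct_paths_with_cue)
print(round(rho, 3))
```

Convergence between the two models would show up as a coefficient near +1 for measures of this kind; values near zero indicate that the two models rank the cues differently.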

We will turn now to a discussion of the signifi- 
cance of our experimental results. 


DISCUSSION AND RESULTING FURTHER TESTS 


The inconsistent correlational results be- 
tween cue importance measures from linear 
models and process tracing models in the pre- 
sent study strongly suggest that sole reliance on 
the linear model to measure cue importance 
could be very misleading. Our results are not in- 
consistent with previous research. Larcker & 
Lessig (1983) also found lack of convergence, 
but their study was not designed to test formally 
for the extent of convergence.

Stated most succinctly, most of what we cur- 
rently believe about cue importance is based on 
linear models research. In this study the infer- 
ences about relative cue importance from linear 
models are dramatically at variance with infer- 
ences drawn from process tracing models that 
are also accurate predictors of subjects’ judg- 
ments and seem equally plausible, ex ante. Why 
does the magnitude of the coefficient score in a 
discriminant model correlate so poorly with 
such other measures of cue importance as first 
cue used, last cue used or frequency of cue use?
We are not prepared to answer this question
based on the current study, but the evidence 
here suggests strongly that it now must become 
a relevant question for accounting researchers. 
Because confidence about the comparative 
importance of different cues is critical in 
designing accounting information systems, the 
divergence observed here must be viewed as 
troubling. Resolving the divergence, subject-by- 
subject, may involve using multiple models 
simultaneously. For example, Larcker & Lessig 
(1983) conjectured that non-convergence of 
cue importance measures would tend to occur 
for subjects with more complex decision nets. 
In the present study, this conjecture was for- 
mally tested. First, for each subject, the number 
of paths in the decision net was calculated. More 
paths means a more complex net. This number 
was then correlated across the 63 subjects with 
each one of the 18 Spearman rank correlation 
coefficients calculated earlier and summarized 
in Table 2. The result is 18 rank-order correlation
coefficients arranged in Table 3. The hypothesis
here is that more complex decision nets
will produce poorer correlation between cue
importance measures in the linear model and
process tracing model. Table 3 presents the
results of these correlations. The Larcker & Lessig
hypothesis would imply negative values for the
B and C measures and positive values for the
inversely related A measures. All of the 18
reported correlations have the appropriate sign,
but not even one of the 18 is significantly
different from zero at the 0.05 level. We interpret this
result to mean that linear models and process
tracing models can diverge in terms of ranking of
cue importance even when process tracing
models yield relatively simple decision nets. Cue
importance in the simpler nets is still different
from coefficient scores in the linear model. In
the current study, it is not just the complexity of
the decision net that causes the poor correlation
on cue importance measures.


TABLE 3. Correlation of number of paths in decision nets
with rank-order correlations between measures of cue
importance for the linear models and the decision nets (table
values are Spearman rank-order correlation coefficients)

Decision net cue               Linear models
importance measure      “Traditional”   “Modified”
A-1                        0.018           0.024
A-2                        0.014           0.043
A-3                        0.019           0.039
B-1                       -0.065           0.022
B-2                       -0.055          -0.024
B-3                       -0.090          -0.045
C-1                       -0.192          -0.263
C-2                       -0.128          -0.059
C-3                       -0.250           0.262

Another possible method for resolving con- 
flicting cue importance measures is to resort to 
a third method of judgment modeling. In the 
present study, subjects were asked to supply 
demographic data and to write a short paragraph 
which summarized their method of judgment 
formation after they completed the experimen- 
tal task. Nearly all subjects were able to sum- 
marize their judgment formation process in a 
few sentences. The responses to the request for 
retrospective introspection were sorted into
two groups. One group was classified as displaying
strong evidence of a contingent judgment
formation process similar to the decision net
idea. The other group contained the remaining
responses. The raters classified the responses
independently. There was 91% agreement between
the raters. All differences between the raters
were resolved by classification in the category
of not displaying strong evidence of a contingent
judgment formation process. A contingent
process was strongly indicated for 35 of the
63 subjects.
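
The agreement figure and the conservative resolution rule described above can be sketched as follows. The code is illustrative only: it assumes two raters and uses invented classifications, and the function names are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Share of responses on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def resolve(rater_a, rater_b):
    """Code a response 1 (contingent) only when both raters say so; any
    disagreement falls into 0 (no strong evidence of a contingent process)."""
    return [1 if a == 1 and b == 1 else 0 for a, b in zip(rater_a, rater_b)]


# Hypothetical codes for ten responses (1 = contingent, 0 = other).
rater_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_b = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1]
print(percent_agreement(rater_a, rater_b))
print(sum(resolve(rater_a, rater_b)), "responses coded as contingent")
```

Resolving disagreements toward the “other” category means the contingent group contains only responses that every rater judged to be strongly contingent.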

A testable hypothesis here is that convergence
on cue importance will be poorer for those
persons clearly categorized as contingent
processors. For these subjects, cue importance
measures generated from the process tracings might
diverge from those generated by linear models
because a contingent process is more overtly
represented. Table 4 displays the same type of
calculations as Table 3 with the exception that
the “contingent process” (coded as “1”) vs
“other” (coded as “0”) classification replaces the
number of paths in the decision net. The
expected signs of the coefficients would again be
positive for the A measures and negative for the
B and C measures. The observed signs are not
consistent with expectations and none of the
measures is significantly different from zero at
the 0.05 level.


TABLE 4. Correlation between processing “style” and rank-
order correlations between measures of cue importance for
the linear models and the decision nets (table values are
Spearman rank-order correlation coefficients)

Decision net cue               Linear models
importance measure      “Traditional”   “Modified”
A-1                        0.203           0.193
A-2                        0.202           0.176
A-3                        0.203           0.180
B-1                       -0.066           0.114
B-2                       -0.062           0.109
B-3                       -0.077           0.093
C-1                        0.146           0.296
C-2                        0.129          -0.063
C-3                        0.204           0.175

Note: The process traces classified as reflective of a contingent
process were coded as “1”; all others were coded as “0”.

Our general conclusion here is that the di- 
vergence in this study is also not clearly related 
to the subjects’ perceptions of their use of con- 
tingent processing. 


INTERPRETATION — LEAN VS RICH DOMAINS 


The utilization of linear models and process
tracing models, in this study as well as in the other
studies cited, has been restricted to “lean
domains” (Bhaskar & Simon, 1977). In a “lean 
domain” problem, the cue set available to the 
subject is fixed and known. The problems can be 
very complex (chess, for example), but they are 
never as complex as “rich domain” problems in 
which the subjects draw upon unspecified data- 
bases, either stored in memory or available upon 
request, to solve a problem or form a judgment. 
Two further “lean” dimensions of all the studies 
cited here are: (1) they all present a large num- 
ber of cases to the subject in a short amount of 
time, and (2) there exists little evidence that the 
subject is highly committed to performing the 
experimental task as well as possible. An
interesting question yet to be addressed is the
comparability and validity of linear models and
process tracing models for rich domain judgments.

For example, consider the decision of granting
a bank loan in an environment where, instead
of seven numerical cues, 70 or 700 cues (not
all numerical) are available. Arguably, the decision
process of bank lending officers functioning
in this rich domain may be highly contingent
upon certain key aspects of the lending environment,
as depicted in Table 5. A linear model
could not adequately capture the nature of the
decision process. Decision nets, on the other
hand, may do a much better job of representing
contingent processes which are likely to be
present in rich domains. One undemonstrated
advantage of the automated method used in this
paper for collecting and analyzing the process
traces is that it could be applied with greater
effectiveness than linear models in problems
involving extensive and multidimensional cue
sets.


TABLE 5. Contingent model of loan officer decision process

                               Money conditions
               “Easy”                       “Tight”

Existing       Perform a cursory            The purpose of the
customer       examination to make          examination is to scale
               sure nothing is              down the amount
               clearly out of line,         requested to a
               then grant the loan          minimum. The loan is
               (which is what you           very likely to be
               expected to do               granted, but the
               originally anyway).          amount may be
                                            reduced.

Potential      Perform an examination       Perform an extensive
new            to decide whether the        analytic review of the
customer       new customer is worth        potential customer
               cultivating. If so,          and the loan amount.
               analysis of the loan
               request itself may well
               be cursory.
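
The contingent character of Table 5 can be illustrated with a short sketch in which the type of examination is selected by the combination of customer type and money conditions. The encoding below is hypothetical (the names `Customer`, `Money` and `review_policy` are illustrative, and the policy strings paraphrase the table); it is meant only to show why such a process is naturally represented as a decision net rather than as a weighted sum of cues.

```python
from enum import Enum


class Customer(Enum):
    EXISTING = "existing customer"
    POTENTIAL_NEW = "potential new customer"


class Money(Enum):
    EASY = "easy"
    TIGHT = "tight"


# Each cell of Table 5, keyed by the two contingency variables.
REVIEW_POLICY = {
    (Customer.EXISTING, Money.EASY):
        "cursory examination, then grant the loan",
    (Customer.EXISTING, Money.TIGHT):
        "examine in order to scale down the amount requested",
    (Customer.POTENTIAL_NEW, Money.EASY):
        "examine whether the customer is worth cultivating",
    (Customer.POTENTIAL_NEW, Money.TIGHT):
        "extensive analytic review of customer and loan amount",
}


def review_policy(customer: Customer, money: Money) -> str:
    """Return the examination strategy for one cell of the contingency table."""
    return REVIEW_POLICY[(customer, money)]


print(review_policy(Customer.EXISTING, Money.TIGHT))
```

Because the purpose of the examination changes from cell to cell, the weight given to any financial cue depends on which cell applies; a single set of fixed coefficients has no way of expressing that dependence.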





SUMMARY 


Because this paper is positioned counter to 
much of the existing literature on human infor- 
mation processing in judgments using account- 
ing information, we believe it will be useful to 
close by briefly summarizing the argument and 
the analysis.

A great many studies in the accounting literature
have used linear regression or discriminant
analysis to model human information processing
and have shown very high levels of predictive
accuracy. Much work in psychology has shown
that the very high predictive power of such
linear models is largely an artifact of the
mathematics. There is really no evidence that
such models measure human information
processing. This realization has led to a new way of
interpreting linear models research in accounting:
the models are said to give evidence about
the differing importance of various information
cues, even if they do not give evidence of how
such cues are really aggregated in decision making.

Some recent work has presented an alternative
methodology for studying human information
processing, the so-called process tracing
approach. Two papers have tried to compare
and contrast linear models with process tracing
models. This research is hampered by the severe
data analysis problems of the process tracing
approach as well as by conceptual problems of
interpreting what are always, at best, partial
traces of how subjects use information to form
judgments. The two studies comparing linear
and process tracing approaches have concluded
that the two approaches yield generally comparable
insights and comparable levels of predictive
ability. Since the predictive ability results could
be explainable as mathematical quirks, the primary
inference from the studies is the evidence
that the two approaches yield comparable
evidence about cue utilization and cue importance.
If it were true that linear models yield similar
insights about cue importance to process tracing
models, then the dramatically lower level of
methodological problems involved with linear
models would indicate that process tracing
models are probably not worth the effort.

Einhorn et al. focussed their paper on the
convergence of the two modelling approaches.
They did not pursue the extent of lack of
convergence. In fact, their sample size was so small
that no testing of convergence was possible anyway.
Larcker & Lessig did find non-convergence
of the two approaches. We have a sample large
enough to permit some testing of the issues
about which they could only conjecture.

In the current study, linear models and pro- 
cess tracing models are developed for a bank- 
ruptcy prediction experiment. All the models 
show good predictive accuracy, so this aspect of 
“significance” can be deemed to be controlled 
across the models. However, the process tracing
models and linear models show very little
convergence on nine different measures of cue
importance. Since one role of accounting research
is to study how accounting information is used
by decision makers, studies of cue importance
are critical in evaluating the efficacy of various
accounting systems. The current study
demonstrates that process tracing studies can yield
much different insights about cue utilization
from linear models studies. The current study
also presents a way of generating process traces
which avoids some of the major pitfalls of earlier
process tracing studies.


Given the methodological refinements in this
study, the fact that it yields cue importance
results which are not comparable to those of the
linear models, and the possibility of adapting the
methodology here to more complex and realistic
problem contexts, there is a strong suggestion
that future research should further probe
the question of cue importance to decision-makers
by means of automated process tracing
approaches. Our primary conclusion is that cue
importance as measured by linear models
coefficients is not highly correlated in our sample with
other measures of cue importance derived from
Larcker & Lessig. Assessing cue importance
remains a very important open topic.


INTRODUCTION


THEODORE J. MOCK 
Center for Accounting Research, University of Southern California 


This special section of Accounting, 
Organizations and Society contains a selection
of the papers presented at the 1987 University of 
Southern California and Deloitte Haskins & Sells 
Audit Judgment Symposium. The Symposium 
was the fifth of a series held at USC which have 
considered various aspects of behavioral, 
judgmental and decision support research into 
auditing. Traditionally, the Symposium has 
chosen not to publish its papers or proceedings. 
On this occasion, however, it seemed 
appropriate to bring at least some of the papers 
to the attention of a wider audience because the 
Call for Papers for the Fifth Symposium 
specifically requested submissions which could 
critically consider both research areas where 
progress had been made and areas that had 
resulted in blind avenues. In addition, a number 
of specific research studies were submitted to 
and presented at the Symposium. The papers 
published in this section are a subset of those 
presented at the Symposium which the authors 
wished to be considered for publication and 
which met the review standards of the journal. 
Over the past five years the Audit Judgment 
Symposium has benefited from a number of 
sources of support that I would like to 
acknowledge. Financial support for the
Symposium has been provided by the Deloitte
Haskins & Sells Foundation and by the Center for
Accounting Research of the University of
Southern California. The counsel and help of
Ward Edwards of USC’s Social Science Research
Institute along with the financial support of
Deloitte Haskins & Sells have been particularly
important in enabling us to make the Symposia
interdisciplinary in nature, attracting the 
participation of outstanding scholars from 
psychology, behavioral decision theory and 
artificial intelligence. The Symposium has 
traditionally been held during the same week as 
Ward Edwards’ Bayesian Conference which
celebrated its 25th Anniversary in 1987. 

In addition to financial support, Deloitte 
Haskins & Sells has provided important input 
into the Symposium through its Foundation 
Presidents (Bill Cole and Gerald Sena) and 
through participation on various panels, 
particularly by John Ellingsen. 

In addition to Ward Edwards, the Symposium 
has been guided by a planning committee which 
over the years has included Paul Watkins, Gary 
Holstrum, Karen Pincus, Dan O'Leary and 
myself. During some of the years, Paul Watkins 
and Gary Holstrum served as co-chairs of the 
Symposium. Administrative support for the 
Symposia has been most ably provided by Mrs 
Ingrid McClendon. 

Of course the most important contribution to 
a symposium of this nature is made by the 
researchers and practitioners that have prepared 
papers and served on various panels and 
discussion groups. This section of Accounting, 
Organizations and Society contains a sample of
such papers and presentations and has benefited 
from the able assistance of the Editors of the 
Journal, particularly Barry Lewis. To all of those 
who have contributed to the first five years of 
the USC/DH&S Audit Judgment Symposium I 
offer my sincere thanks.


AUDIT JUDGMENT RESEARCH*


PAUL E. JOHNSON, KARIM JAMAL and R. GLEN BERRYMAN 
University of Minnesota 


Abstract 
This paper identifies themes in the audit judgment literature and suggests implications of the research that 
supports them. Research on outcome behaviors and investigations of the process of audit decision making 
are stressed. Issues in both methodology and theory are considered. A brief assessment of the current state 
of knowledge in the field of audit judgment research is provided, and an argument is offered for a direction 
of future work based on the idea of context and the need to understand the meaning that tasks have for the 
subjects who perform them. The paper concludes with a brief discussion of data from studies currently in
progress at the University of Minnesota.


AUDIT JUDGMENT: BEHAVIOR 


The generally accepted goal of audit judgment 
research has been to understand and improve 
auditor decision making (Libby, 1981). In some 
fields, it is possible to evaluate decisions because 
outcomes are fairly clear. In most audit situa- 
tions, however, there is little or no knowledge of 
criterion variables. The lack of a normative 
criterion for audit tasks aroused early interest in
the use of consensus among auditors as a means 
of evaluating auditor judgment. The basis for 
using consensus as an evaluation criterion seems 
to have been provided by Einhorn (1974) who 
suggested that consensus is a necessary, but not 
sufficient, component of expert judgment. 

The concern among researchers as well as 
audit firms regarding different auditors making 
widely differing decisions in the same cir- 
cumstances led to early studies in which consen- 
sus was used as a major dependent variable. 
These studies generally followed the Brunswik 
(1952) lens model paradigm, which relies upon 
the application of linear models such as linear
regression and analysis of variance to draw
inferences about human judgment.


Lens model studies 

Ashton (1974) began the lens model work in 
auditing by studying cue utilization, consistency 
and consensus of auditors’ internal control judg- 
ments. Ashton’s study was conducted using 63 
practicing auditors from four public accounting 
firms. A majority of the auditors had 2 or 3 years 
experience, and were asked to make preliminary 
judgments which would be subject to review 
and possible revision at higher levels in the or- 
ganization. Each subject was presented with 32 
cases and asked to make a rating judgment of in- 
ternal control strength for each case. The task 
was presented a second time, some 6-13 weeks 
later, in order to assess the consistency of judg- 
ment over time. 

Ashton reported a high consensus between 
auditors (r = 0.70) and consistency over the 
two administrations (r = 0.81). Main effects ac- 
counted for over 80% of judgment variance and 
none of the interactions were significant,
suggesting that the auditors did not look for
patterns of answers to the questions.
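
The two correlational measures reported here can be illustrated with a brief sketch. It assumes, as one common operationalisation and not necessarily the exact computation in the original study, that consensus is the mean of the pairwise correlations between auditors' ratings and that consistency is the correlation between one auditor's ratings on the two administrations. All ratings shown are invented.

```python
from itertools import combinations


def pearson(x, y):
    """Product-moment correlation of two equal-length rating sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def consensus(ratings_by_auditor):
    """Mean correlation over all pairs of auditors rating the same cases."""
    pairs = list(combinations(ratings_by_auditor.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def consistency(first_administration, second_administration):
    """Test-retest correlation for a single auditor."""
    return pearson(first_administration, second_administration)


# Hypothetical six-point internal control ratings for five cases.
ratings = {
    "auditor_1": [1, 3, 4, 6, 2],
    "auditor_2": [2, 3, 5, 6, 1],
    "auditor_3": [1, 4, 3, 5, 3],
}
print(round(consensus(ratings), 2))
print(round(consistency(ratings["auditor_1"], [1, 3, 5, 6, 2]), 2))
```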

Ashton & Kramer (1980) replicated Ashton’s 
study with 30 auditing students as subjects, 
using the same six cues. They obtained the same 
pattern of responses as those in Ashton’s original 
(1974) study. The finding of similar results for 
both students and auditors may have been due to 
the fact that the auditors used were not experts, 
or it may have been due to the restriction of 
range for the dependent variable (a six point rat- 
ing scale). 

Joyce (1976) followed up Ashton’s work by 
presenting subjects with a series of combina- 
tions of strengths and weaknesses regarding 
three internal controls in an accounts receivable 
system. The subjects were asked to estimate the 
amount of time to allocate to each of five ac- 
counts receivable audit procedures. The use of 
time (number of hours) avoided the problems of 
restriction of range in Ashton’s rating scales. In 
Joyce's study, subjects were required to do two
things: 

(1) evaluate internal controls, and

(2) estimate amount of time to be allocated to
the procedures.

Subjects could agree on the evaluation of internal
control (as in Ashton’s study) but disagree on
how much time to allocate to specific audit
procedures.

Joyce reported a relatively low consensus of 
0.37, based on total time judgments. He found 
that as subjects’ experience increased, consen- 
sus decreased. Joyce also found high stability 
(0.86), virtually no interaction effects and main 
effects that accounted for approximately 75% of 
the total variance.

In general, lens model studies have shown that 
cues (main effects) account for most of the var- 
iance, that interactions are not significant, and 
that large differences exist across individuals. 
Not surprisingly, the inconsistencies across
individuals have been of concern to audit firms. A
major implication of the lens model research has 
been that decision aids developed to assist au- 
ditors should be based upon the judgments of 
experts, and that these judgments should be 
combined by mechanical (usually mathematical)
models. (See Libby & Lewis, 1977, for a
comprehensive review of the literature.)

Two major unresolved issues in the lens 
model literature led researchers away from this 
approach in the late 1970s. The first was the lack 
of an accepted normative criterion for evaluat- 
ing auditor judgments. A second, and more fun- 
damental problem, however, was that the lens 
model work did not seem to provide insight into 
the processes being used by auditors in making 
judgments. Thus, researchers in accounting 
were receptive to a new paradigm developed by 
Tversky & Kahneman (1974) which offered a 
normative criterion as well as process descrip- 
tions. 


Heuristics and biases 

In their highly influential 1974 article, 
Tversky & Kahneman proposed that people rely 
on a limited number of heuristics that enable 
them to cope with complex judgment situations. 
Tversky & Kahneman proposed that, while 
heuristics may be useful in many circumstances, 
they can lead to serious and systematic errors 
because they are not influenced by several fac- 
tors that should affect judgments according to 
the normative Bayesian model. For example, in 
the case of a person using the representativeness 
heuristic, the sole criterion used for making 
judgments is perceived similarity. Normatively 
important considerations, such as sample size, 
base rates, data reliability and diagnosticity, are 
ignored because they do not affect perceived 
similarity. 
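
The role that a base rate plays in the normative Bayesian model can be shown with a small worked example. The numbers are invented for illustration and do not come from any of the studies cited; the function name `posterior` and its parameter names are hypothetical.

```python
def posterior(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Bayes' rule for a binary hypothesis after observing one positive cue.

    prior            -- base rate of the condition (e.g. a material error)
    hit_rate         -- P(cue present | condition present)
    false_alarm_rate -- P(cue present | condition absent)
    """
    evidence = prior * hit_rate + (1 - prior) * false_alarm_rate
    return prior * hit_rate / evidence


# Hypothetical case: a rare condition (2% base rate) and a cue that looks
# highly "representative" of it (90% hit rate, 10% false alarm rate).
print(round(posterior(0.02, 0.90, 0.10), 3))
# The same cue with an even base rate.
print(round(posterior(0.50, 0.90, 0.10), 3))
```

With the 2% base rate the posterior probability is roughly 0.16, far below the 0.90 that a judgment based on similarity alone would suggest. A judge relying only on representativeness would give the same answer in both cases, which is the sense in which the base rate is ignored.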

Early research on neglect of base rates was 
conducted by psychologists using non-expert 
subjects in rather abstract tasks. This work was 
extended by researchers who were interested in 
determining whether auditors exhibit the same 
biases in audit settings. Results of studies such as
that performed by Joyce & Biddle (1981) have
indicated that, while auditors exhibit the same 
overall information processing behavior as non- 
auditors, they seem to be less affected by the 
heuristics and biases identified in the earlier 
work. See Libby & Lewis (1982) for a review of 
these and other studies. 

In an important article, Einhorn & Hogarth
(1981) defined optimal judgments as ones that
maximize or minimize some explicit and
measurable criterion such as profits, errors or
time, conditional on certain environmental
assumptions and a specific time horizon. This
definition stressed the conditional nature of
optimality and suggested that the normative rules
studied in the heuristics and biases work may
apply only in simple settings. Einhorn & Hogarth
emphasized the importance of understanding
the role of attention, memory and cognitive
representation in making judgments. They
recognized that the statistical models used in previous
work were, in fact, not optimal models in real
settings and urged researchers to study behavior
in complex task environments where subjects
could play an active role in the process of
judgment and choice.


AUDIT JUDGMENT: PROCESS 


Felix & Kinney (1982) reviewed the audit
literature and found that very little was known 
about how auditors process and combine infor- 
mation from compliance and substantive tests. 
They suggested that only limited progress has 
been made in the field of audit research. No 
strong model or theory of auditing has been de- 
veloped, and there have been no tests of auditor 
behavior in realistic task settings. In order to 
learn more about current audit practice they 
suggested that researchers could mail surveys, 
conduct field studies (interviews) and examine 
audit working papers.

The emphasis in the Einhorn & Hogarth 
(1981) and Felix & Kinney (1982) review arti- 
cles on the importance of understanding the 
psychological processes underlying judgment 
and choice has led to an interest in studying
decision processes.


Decision processes

An early study of decision processes in auditing
was conducted by Biggs & Mock (1983), who
collected verbal protocols from four senior
auditors employed by an international accounting
firm, using a case previously developed by Mock
& Turner (1981). Results focused on completeness
of search and type of activities subjects
engaged in. Biggs & Mock found that most of the
subjects’ reasoning activity took place in what they
called the information search and information
evaluation stages of the task. They also found
two patterns of task behaviour. Two of the
subjects, A and C, performed the task using a
systematic strategy, consisting of a thorough and
sequential search of available information prior to
making any decisions. A third subject (subject
D) used a directed strategy which consisted of
selecting a particular audit step and then
searching for information relevant to that step. Once
the decision was made for the first step, the
process was repeated until all steps were
completed. Subject B followed a mixed strategy
consisting of a systematic strategy part of the time,
and a directed strategy part of the time. A
comparison of subject D’s behavior (directed
strategy) with that of subjects A and C (systematic
strategy) showed that subject D took less
time to do the task, made fewer changes to the
planned sample sizes and obtained less
information than subjects A and C. Although there was
no normative guideline to conclude which one
of the two strategies was more effective, the
auditors with more experience used the
systematic strategy.

A review of the sample size decisions in the 
Biggs & Mock (1983) study revealed an appa- 
rent lack of consensus. However, no explanation 
was provided for a subject’s use of a systematic 
or directed strategy. It is not known, for 
example, what a good decision was for the task, 
what the critical cues were that influenced
subjects’ judgments, nor how these cues were
interpreted.

Despite such shortcomings, the study by Biggs 
& Mock was an important response to Einhorn &
Hogarth’s (1981) critique of the heuristics and 
biases literature. The study focused upon build- 
ing a descriptive model of how auditors make 
decisions and employed a task based upon an ac- 
tual company that was rich in complexity. 

Rather than inspiring research directed
towards the development of audit theory,
however, the pioneering work by Biggs & Mock
(1983) has led to research directed towards
improving decision making using computer
systems as devices for creating auditor decision
aids.


Expert systems 

Recent studies conducted by researchers in- 
terested in developing computer-based decision 
aids (usually called expert systems) have 
adopted the Biggs & Mock approach to investi- 
gate decision processes, which is based upon the 
use of verbal protocol data. 

In a study by Hansen & Messier (1986) the 
authors describe the results of a preliminary in- 
vestigation of EDP-XPERT, an expert system 
which is intended to assist computer audit 
specialists (CASs) in making judgments as to the 
reliability of controls in advanced computer
environments. These investigators conducted a
protocol study where the objective was to iden- 
tify IF-THEN rules that might be appropriate for 
the system knowledge base. The protocol study 
is discussed in detail in Biggs et al. (1986a), and 
the focus here is on what was done, what was 
learned, and the implications for conducting 
process studies to understand auditor expertise. 
Three managers from one Big Eight accounting 
firm served as subjects in the protocol part of the 
study. Each of the managers was a computer 
audit specialist with 2—4 years of experience, in- 
cluding the auditing of complex EDP systems. 
The case used in the research was adapted from 
client working papers, contained over 40 pages 
of information and was used as part of the firm’s 
training course materials. Each subject was told 
that he/she was replacing the previous CAS who 
was reassigned because of other client commit- 
ments. Subjects were asked to review the client's 
EDP system, evaluate internal controls over the 
sales/receivables cycle and prepare an audit pro- 
gram. The public accounting firm that provided 
the case also had a suggested solution which 
served as a criterion for analyzing subjects’
responses to the case.

The data were analyzed as suggested by Biggs 
& Mock (1983). A microlevel analysis was done 
in which protocols were analyzed for knowledge
states and operators (Newell & Simon, 1972). In
addition, a macrolevel analysis was
done consisting of two parts: (1) decision flow- 
charts and (2) episode abstracts. The flowcharts 
were a graphic representation of the overall de- 
cision process of each subject. The episode 
abstract was a description of the sequence of 
goals set by the subject and the major informa- 
tion processing activities related to these goals. 
The results of the macro analysis were not pre- 
sented in the paper; the authors argued that they 
provided little additional information concern- 
ing the CAS’s decision processes. 

The results reported by Biggs et al. (1986a) 
are the number of lines of protocol generated 
and a classification of operators into three cate- 
gories: (1) information acquisition, (2) evalua- 
tion, and (3) audit decision. All three protocol 
subjects identified a majority of the eight con- 
trols contained in the solution. The types of 
operators found were similar to those found by 
Biggs & Mock (1983). No model was presented 
indicating how internal control judgments are 
made in advanced computer environments, and 
the study did not lead to finding many decision 
rules that could be used to construct an expert 
system. 

The work combining a process approach to 
understanding decision making with the goal of 
building an expert system has led to the
development of a number of working systems. However,
these systems have not been developed with a 
good understanding of what processing the 
auditor, whose expertise is at the heart of the 
system, is doing. There has been virtually no de- 
velopment of audit judgment theory or hypoth- 
eses to guide future work, especially in realistic 
task situations. (See additional studies by Biggs
et al., 1986b; Dungan & Chandler, 1985;
Meservy et al., 1986; Shpilberg & Graham,
1986.)


AUDIT JUDGMENT RESEARCH: 
AN ASSESSMENT 


In this section the research described above is 
reviewed with the objective of identifying rea- 
sons why there has been only limited progress to 
date. Sources of difficulty in past audit judgment
work are identified and an alternative approach
is illustrated below with an example from work 
in progress at the University of Minnesota. 


Methodology 


The judgment tasks used in lens model research
as well as in studies of heuristics and
biases were highly structured, and in some in- 
stances were unfamiliar to subjects. In such tasks 
subjects play a relatively passive role, processing 
only the information given to them. However, as 
Einhorn (1976) has suggested, real decision 
tasks are poorly structured. In reality, informa- 
tion must be searched for, data are not perfectly 
reliable and hypothesis formation and confirma- 
tion/disconfirmation occurs within a broad 
range of possibilities. In most situations, the au- 
ditor must take an active role in searching for as 
well as evaluating information. The processes of 
reasoning that work on the well-structured tasks 
of the psychological laboratory often do not 
seem to generalize to settings of practice (Ebbe- 
son & Konecni, 1980). 

Biggs & Mock (1983) employed more realis- 
tic tasks and focused on decision processes, 
using the methodology of protocol analysis. 
However, the research in auditing that has 
adopted this approach has been conducted in 
such a way that little was learned about how
subjects did the task. Measures employed such as
time taken to complete the task, percentage of
information searched for, and type of activities
engaged in, are all descriptions of behavior 
rather than statements of the reasoning process 
underlying this behavior. It is not known, for ex- 
ample, what the critical cues are, how subjects 
interpret them, how the interpretation of 
specific cues affects information search, nor how
various cues are combined to make sample size
decisions.

Although investigations of the audit judgment 
process have tended to employ more realistic 
tasks than ones used in the study of heuristics 
and biases, the use of only one case also makes it 
difficult to separate those aspects of behavior 
which are unique to the case, and those that are 
common across cases. 

Despite an effort to create experimental tasks
based on real audit cases, the tasks given to
subjects have not always elicited the expertise that
investigators were hoping to understand. For
example, Biggs & Mock (1983) and Biggs et al.
(1986a) constructed audit cases from the working
paper files and training materials used by
audit firms. While these were complex cases,
they appeared to be normal audit tasks in which
subjects could use relatively automated reason- 
ing. The result of such research has been that in- 
vestigators discover (not surprisingly) that ex- 
perts do better than novices. However, no in- 
sight is provided into why experts are better 
since the underlying expertise is not made 
explicit in subjects’ behavior. 

An additional difficulty has been the lack of 
use of recognized experts as subjects. Although 
most “process studies” in auditing have used 
practicing auditors there has been no assurance 
that subjects deemed to be experts have a high 
level of expertise in the task being studied. An 
additional difficulty, inherent in this type of re- 
search, is that only a small number of subjects 
have been investigated in a given study. Studies 
conducted by Biggs & Mock (1983) and Biggs et 
al. (1986a) collected protocols from four sub- 
jects or less. 

Audit research has drawn heavily upon find- 
ings and methodology from the field of experi- 
mental psychology. Investigations such as those 
reviewed here, have often been designed to
demonstrate the conditions under which findings
discovered in the psychological laboratory hold
in the context of audit judgment. Such knowl- 
edge is important because it advances an under- 
standing of the human mind, and also because it 
tests the generalization of findings from the
laboratory to the world of everyday experience. 
Unfortunately, as others within the field have 
noted (e.g. Felix & Kinney, 1982), such work has
not advanced an understanding of the nature of 
the behavior of interest. After several years of re- 
search, there is no theory of audit judgment, and 
very little understanding of what auditors actu- 
ally do. 


Context 
Part of the reason why audit research has
taken its current direction may be due to its
preoccupation with providing decision aids for 
the practicing auditor. Even when studies 
moved toward the investigation of process,
researchers have concentrated on extracting
“rules” from data to be used in the implementation
of a computer program, rather than building
a theory from which such rules might be derived.
However, the rules that comprise an expert
system knowledge base are seldom present
explicitly in an expert’s behavior. (They are, in 
fact, representations of the underlying knowl- 
edge.) Such rules must be created from an un- 
derstanding of how the task is done. Research on 
expertise is intended to lead to the discovery of
the nature of the underlying knowledge and 
reasoning, not as is sometimes supposed, to find 
“rules” in the data obtained during the
performance of a task.

Audit research is concerned not just with 
observed behavior, but also with the thoughts 
and knowledge of highly trained professionals 
who are carrying out complex tasks in a tightly
constrained problem-solving context. Unlike 
the experimental psychologist who is interested 
in describing fundamental aspects of human cog- 
nition, the audit researcher is concerned with 
discovering the knowledge and reasoning pro- 
cess of a skilled problem solver. The goals of the 
two efforts are related, but also fundamentally 
different. 

As an example of what happens when the 
methods of one field are adopted in another, 
consider again the work on decision processes. 
The work by Biggs and others (e.g. Biggs & 
Mock, 1983) applied methods of analysis to 
problem-solving data that did not describe most
of the contextual knowledge developed by sub- 
jects in the course of learning to carry out the 
kinds of tasks investigated in the research. The
knowledge states and operators adopted in this 
work for describing reasoning processes were 
developed by Newell & Simon (1972) to 
describe the process of solving well-structured 
problems. For such problems the process of 
problem-solving can be viewed as a state-space 
search (Nilsson, 1980). Despite Simon’s (1973) 
suggestion that ill-structured problems might be
solved by partitioning them into previously
solved well-structured problems, it is unlikely that
the expertise of the professional problem solver 
will be captured solely in these terms. 
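
The state-space view of well-structured problem solving referred to above can be illustrated with a minimal sketch. It is not drawn from Newell & Simon (1972) or Nilsson (1980); the toy problem (reaching 10 from 1), the operator names `double` and `add_one`, and the upper bound of 20 are hypothetical choices made only for exposition. A state is a number, an operator maps one state to another, and a solution is the sequence of operators that leads from the start state to a goal state.

```python
from collections import deque


def state_space_search(start, is_goal, operators):
    """Breadth-first search over states.

    Returns the shortest list of operator names leading from `start`
    to a state satisfying `is_goal`, or None if no such path exists.
    """
    frontier = deque([(start, [])])  # (state, operators applied so far)
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, apply_op in operators:
            nxt = apply_op(state)
            # An operator returns None when it does not apply to a state.
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None


# Hypothetical well-structured problem: states are integers up to 20.
operators = [
    ("double", lambda n: n * 2 if n * 2 <= 20 else None),
    ("add_one", lambda n: n + 1 if n + 1 <= 20 else None),
]

solution = state_space_search(1, lambda n: n == 10, operators)
print(solution)
```

The sketch works only because the states, operators and goal test are fully specified in advance. That is the sense in which such problems are well structured, and it is the property that audit tasks of the kind discussed here generally lack.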

The result of such work is that the process of 
audit judgment is described in terms of discrete
units of behavior rather than concepts from a 
theory or model of the process of audit judg- 
ment itself. An analogous problem in physics 
would be for the motion of bodies to be
described solely in terms of “pointer readings” on
spring balances and meter sticks instead of con- 
cepts such as force, mass and acceleration. Part 
of the difficulty faced in attempting to under- 
stand the world of everyday experience is that 
the paradigm that guides thinking and research 
has been behaviorism, rather than contex- 
tualism (Pepper, 1972). According to the con- 
textual metaphor, it isn’t only the behaviors that 
can be inferred from the data on a task, but also 
the meaning these behaviors have for the sub- 
jects who express them. 

The issue of “meaning” is one of the funda- 
mental problems that has occupied psychology 
since the so-called psycho-linguistic revolution 
of the 1960s (Chomsky, 1959; Jenkins, 1968). 
Although a description of this issue in the field of 
psychology is beyond the scope of this paper, it 
is worth noting that meaning, and its companion 
problem of knowledge, have been responsible, 
in part, for the development of the relatively 
new field of cognitive science. The remaining 
portion of this paper briefly sketches one alter- 
native approach to conducting research on audit 
judgment which is based on the attempt to deal 
with these problems. 

At the outset it is possible to distinguish three 
types of theory: a theory of the auditor, a theory 
of auditing and a theory of how to audit. The first 
is a theory of persons or individuals who carry 
out a specific kind of task. The second is a theory 
of that task, and the third is a theory of a specific 
kind of activity. The first two theories are de- 
scriptive, the last is prescriptive. The first two 
are developed by investigators of the 
phenomena of auditing. The last can be de- 
veloped by audit practitioners. 

The research reviewed here has been largely
concerned with auditor behavior, and though
there has been relatively little theory developed,
what exists falls under a theory of the auditor.
Books on auditing written by practitioners and 
scholars in the field comprise theory of the third 
kind. These are useful as a source of insight into 
what practitioners know and do, but they do not, 
by themselves, provide theory to guide research. 
The second kind of theory has been most neg- 
lected, and it is the second kind of theory that 
provides a basis for understanding what subjects 
know that enables them to perform a task. 

In information processing terms, a theory of 
auditing is a theory of what must be computed 
by any processor that attempts to perform an 
audit task. Different types of processors have
different processing constraints and will implement
a given computation in accordance with
these constraints (e.g. a major processing con- 
straint for the human processor is short-term 
memory). In this sense a digital computer per- 
forms the task of auditing differently than a 
human being. The important point, however, is 
that for a computation to be performed by either 
type of processor, it is necessary to have a state- 
ment of the requirements for the task of auditing, 
a statement of what must be done as opposed to 
how this is done (Johnson et al, 1987). 

The distinction between what and how with
regard to the performance of a task is similar to 
Newell’s (1981) distinction between the knowl- 
edge level and the symbol level in any informa- 
tion processing system. From the perspective of 
the argument presented here, it is important to 
know first about the nature of the task before de- 
scribing how it is performed. A theory of audit- 
ing is a theory of the knowledge that is required 
for doing auditing. At the level of an expert au- 
ditor, this operative knowledge is the expertise 
required to perform the audit task. 

If expertise is the knowledge that is required 
to perform a given task, then it is also expertise 
that enables subjects to understand the meaning 
of characteristics of that task. Interestingly, it is 
also expertise that lies at the heart of the audit
researcher’s attempt to build decision aids for the
practicing auditor. The difficulty with the approach
taken by audit researchers thus far is that
the formalism of a representation (e.g. rules and
frames) has been confused with the content of 
the representation. Rather than look for “rules” 
in the problem-solving data of a task, analysis 
must identify evidence of the expertise that 
guides what the task performer does. 


AUDIT RESEARCH: AN EXAMPLE 


The following example is based on an attempt 
to come to grips with some of the issues raised in 
this paper. In one sense the work falls victim to 
criticisms made of other work in auditing (e.g. a 
small number of subjects and only a few tasks). 
The research does, however, have three import- 
ant characteristics for purposes of the present 
discussion: (1) the tasks developed for presenta- 
tion to subjects are based upon a genuine audit 
issue — namely, fraud detection, (2) the analysis 
of data focuses upon the knowledge that is 
necessary to carry out an audit task, and (3) the 
tasks are challenging and provide an opportunity
for expert auditors to make errors. 

The attempt here is to develop a theory that 
describes the knowledge required to find ir- 
regularities and unintentional errors in financial 
statements. This kind of theory, which is at the 
“knowledge level”, is termed a theory of expertise;
it describes the necessary conditions for
successful task performance (Mohr, 1982). 

One means of generating data relevant to un- 
derstanding what subjects know is through the 
use of tasks that are challenging enough to elicit
errors, even from proficient and highly expert 
individuals. Such tasks are useful because they 
provide insight into reasoning processes, enable 
the identification of critical cues and aid in de- 
termining how these cues are interpreted by 
subjects in the performance of a task. Constructing
tasks in which subjects make errors also
reveals the limits of a subject’s adaptation, and 
permits inferences about the process that under- 
lies task performance (Simon, 1983). The use of 
these kinds of tasks permits an examination of 
behavior uncontaminated by the relatively auto- 
mated responses elicited by more familiar tasks. 
By investigating the limits of an individual’s
cognitive ability in realistic tasks, it is also possible
to distinguish expert and novice auditors, not
only on the basis of outcome measures but in 
terms of the processes of problem solving as 
well. 

In the work presented here the focus is on the 
task of fraud detection. In such a task the detec- 
tion of irregularities requires a high level of ex- 
pertise. Irregularities are defined as intentional 
distortions of financial statements such as frauds 
perpetrated by managers who are high enough 
in an organization to have the power to override 
the accounting controls, or who are not bound 
by such controls. When management fraud 
exists, compliance and substantive tests are not 
necessarily effective. A key deterrent to the iss- 
uance of an unqualified audit report when mis- 
leading financial statements are prepared is the 
expertise of the auditors who review the 
financial statements before the firm issues an 
audit opinion. 

The task of concurring partner review was 
selected for the investigation of expertise in this 
context since such reviews are commonly used 
by large audit firms to provide an independent 
check on the fairness of financial statements for 
publicly held companies and for sensitive audit 
clients before issuance of the audit report. 
Frauds can be considered to be deliberately
constructed garden paths (Johnson et al, in press).
According to this argument, perpetrators of 
frauds design them so that their auditors will be 
misled as to the financial position and/or operat- 
ing results, and an unqualified report will be is- 
sued. 


Subjects 

The research described here is part of a long- 
term research project. The results presented 
were obtained from two subjects in a pilot study 
for a larger investigation currently in progress. 
Subject S1 is a partner in a Big Eight audit firm 
and has 40 years of audit experience. He is desig- 
nated as an SEC Partner and has conducted 
numerous concurring partner reviews over an 
extended period of time. He is classified as an ex- 
pert for purposes of this study. Subject S2 has 
been a partner for about two years in another Big
Eight audit firm and has 11 years of audit
experience. He is classified as a novice.


Tasks 

The subjects were given three audit cases and 
asked to review them with the objective of 
determining whether an unqualified opinion as 
recommended by the engagement partner is ap- 
propriate for issuance. The cases were con- 
structed using the annual reports and 10K re- 
ports of actual companies. In this paper, results 
are presented for only one case, a medical pro- 
ducts firm, where a fraud was perpetrated by 
senior management. The auditors of the com- 
pany, a Big Eight audit firm, issued an unqualified 
audit opinion. The other two cases, consisting of 
a training case and a case containing an uninten- 
tional error, will not be discussed further. 

In the medical products case, senior manage- 
ment of a medical products firm overstated net 
income by approximately 90%. The case is
complex (the summary is 20 pages in length),
involving a large number of issues. Subjects were
provided with a narrative description of the
business and a set of financial statements and notes
thereto in standard financial statement format.
No audit working papers were provided. Sub- 
jects were asked to review the case and indicate 
whether they would be willing to sign an unqual- 
ified audit opinion, and what questions, if any, 
they would investigate further. 

A garden path (Johnson et al, in press) was 
present in the case due to its description as a 
high tech, high growth firm in the medical pro- 
ducts industry. This description can lead the un- 
wary auditor to expect large increases in sales 
and net income, which is exactly what the per- 
petrator of the fraud has done — inflated sales 
and net income. There are four irregularities in- 
volved in the case, all of which are income en- 
hancing: 

(1) Unordered goods were shipped to 
cooperating distributors and recorded as sales. 
The goods were returned after the end of the 
period. 

(2) Cost/unit of items in Finished Goods In- 
ventory was arbitrarily increased. 

(3) Some expenses (e.g. research and
development) were improperly capitalized during
the year. 

(4) An accounting policy change was made to 
capitalize tooling costs. The change is question- 
able. 

A number of cues in the case are consistent 
with what one might expect from a high growth, 
high tech company: (1) the company has a re- 
search and development division which is intro- 
ducing new products; (2) the company does not 
pay dividends; (3) there is a large increase in 
fixed assets, and (4) large increases in sales and 
net income have been reported. A number of 
cues are also present which divert the subject’s 
attention away from the actual problems of the 
firm: (1) the existence of foreign subsidiaries 
with results needing analysis, (2) tax payable 
shown as zero on the balance sheet; (3) tax ex- 
pense declining substantially as a percentage of 
net income; and (4) gross profit as a percentage 
of sales declining substantially. These “blind al- 
leys” lead the thinking of the auditor away from 
the possibility that the company is trying to man- 
ipulate income. 


Data analysis 

The protocol data from the two subjects were 
analyzed to identify what cues were interpreted, 
what evaluation of alternative solutions was 
undertaken and what final audit opinion was 
reached. This analysis required a series of trans- 
formations of the raw data of the problem-solv- 
ing record into a more abstract representation. 
The analysis employed here consisted of three 
levels of protocol translation. The goal of level 1 
was to identify the problem representation gen- 
erated by each subject. The goal of level 2 was to 
determine a competing set of possible solutions. 
The goal of level 3 was to identify the line(s) of 
reasoning used by subjects in deciding upon the 
proper audit opinion. 

In level 1 analysis, each protocol was coded in 
terms of script—episode pairs as responses to a 
cue or set of cues. The protocol was coded in 
terms of scripts to permit the identification of 
the information the subject processed. This is 
analogous to “knowledge states” analysis proposed
by Newell & Simon (1972), though at a
higher level of abstraction which allows for the
inclusion of domain knowledge. Each protocol 
was partitioned into a series of scripts separated 
by natural boundaries of the task such as plan- 
ning documents, balance sheet, income state- 
ments, etc. Each individual financial statement 
was considered to be a single script. Episodes 
represent cognitive actions taken by the subject 
and are analogous to “operators”, also at a level 
of abstraction which involves concepts from the 
domain of auditing. Episodes represent seg- 
ments of behavior taken to achieve a goal. 
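
The level 1 coding scheme described above can be pictured as a simple data structure. The following is only a sketch under stated assumptions: the class name `Episode`, its field names, and the four coded entries are hypothetical and are not taken from the actual protocols; only the category labels (growth, tax, disclosure, income manipulation) come from the categories reported in this paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Episode:
    """One script-episode pair from a coded protocol (hypothetical layout)."""
    script: str      # task segment, e.g. a single financial statement
    cues: list       # cue or set of cues the subject responded to
    action: str      # cognitive action taken to achieve a goal
    category: str    # audit concept assigned by the coder


def problem_representation(episodes):
    """Tally the audit categories a subject used to evaluate cues."""
    return Counter(episode.category for episode in episodes)


# Illustrative entries only; they do not reproduce any subject's protocol.
coded_protocol = [
    Episode("income statement", ["sales returns"],
            "compare with prior year", "income manipulation"),
    Episode("income statement", ["gross profit"],
            "compare with expectation", "growth"),
    Episode("balance sheet", ["tax payable"],
            "seek explanation", "tax"),
    Episode("notes", ["tooling costs"],
            "evaluate policy change", "disclosure"),
]

representation = problem_representation(coded_protocol)
print(representation)
```

Keeping the script and the episode as separate fields mirrors the distinction drawn in the text between the information a subject processed and the action taken on it.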

The episodes used by the two subjects in re- 
sponding to the fraud case were categorized to 
identify underlying audit concepts that could 
serve as abstractions about the data of the case. 
This process led to the identification of liquidity, 
litigation, growth, disclosure, tax and income 
manipulation as possible categories for evaluat- 
ing cues. These categories comprised the prob- 
lem representation generated by the subjects. 

In the level 2 analysis, the categories iden- 
tified in level 1 were again categorized. The ob- 
jective here was to understand relationships 
among the possible categories. It was important 
to determine (for example) which categories 
went together, which clusters of categories 
competed with each other as possible formula- 
tions of the problem, and within each cluster, 
the specific set of categories from which the au- 
ditor had to choose a solution. This analysis re- 
sulted in the identification of a set of competing 
solutions (competitor set). This set of alterna- 
tives consisted of specific types of audit opinions 
which the auditor could issue, such as an unqual- 
ified opinion, various kinds of qualified opinions 
and an adverse opinion. 

In level 3 analysis, relationships among cues 
were identified. The objective of this analysis 
was to identify cues that were combined con- 
figurally by subjects as a means of detecting pat- 
terns in the data. After a problem representation, 
the competitor set of possible solutions and rela- 
tionships among cues were identified, a qualita- 
tive model of the problem-solving process used 
by each subject was constructed. In this model, 
cues were assigned to solutions, and each
subject’s protocol was analyzed to determine how
each cue was interpreted. Finally, cues were
coded (+ / —) as being consistent/inconsistent 
with the solution being evaluated. 

Once a qualitative model was constructed for 
each subject, the lines of reasoning used to inter- 
pret cues were identified. It was assumed that 
subjects might experience difficulty in solving 
the problem, either because they did not attend 
to critical cues or because they attended to crit- 
ical cues but failed to interpret them correctly. 

Knowledge of auditing entered into decisions 
made at each level of data analysis. The goal of 
the analysis was to obtain a level of abstraction 
such that the description of what the subjects 
were doing (qualitative model) would incorpor- 
ate the knowledge of the practicing auditors and 
represent the sources of meaning used by
subjects in interpreting the data of the case.¹


Results 
As a starting point for the data analysis, a
description of the requirements for performing the
task of concurring partner review was
constructed.² According to this description (shown
in Fig. 1), the problem solver first builds a
picture of the firm. He/she develops an understanding
of the nature of the client’s business,
specifically, the types of products and markets the
company is engaged in, the financial stability of the
company and any special reporting require- 
ments. This representation gives the auditor a 
set of expectations for use in analyzing the data 
of the case. Overall risk analysis is determined 
using general business knowledge, industry 
knowledge and knowledge of the specific
business being audited. Materiality judgments are
made using these types of knowledge to narrow
down the number of items on which to focus
attention. In the overall risk analysis, accounts
with large balances and high risk amounts (large
and unusual amounts) are identified using 
knowledge of the business. Once the key 
accounts have been identified, the data are re- 
viewed for consistent and inconsistent informa- 
tion. The data are compared with the representa- 
tion and changes in one account are cross refer- 
enced with other related accounts. Notes to the 
financial statements are reviewed, and cross re- 
ferenced with the representation of the com- 
pany and related account balances. 

The set of possible solutions that can be 
realized by performing the task in the present 
case was: 

(1) unqualified opinion (the garden path
response)

(2) qualified opinion related to one or more of
the following: litigation, liquidity, disclosure and
consistency (four blind-alleys)

(3) adverse opinion (correct answer).

The description shown in Fig. 1 was used to 
segment the protocol data from each of the two 
subjects. Based upon this initial segmentation 
and a subsequent analysis as described above, an 
interpretation (qualitative model) of subject be- 
havior was developed. This model is in two 
parts: a representation used by the subject to 
frame the problem; and a line of reasoning used 
to evaluate cues in the case, based on this rep- 
resentation. According to the analysis, both sub- 
jects first generated a context for the case in 
which the company was interpreted as a high 
tech, high growth firm in the medical products 


¹ It is important to establish the reliability of the data on which inferences are based in any investigation of psychological processes.
It is especially important in the case of protocol data to determine that any proposed representation of a subject's
problem-solving behavior can be agreed upon by multiple scorers. A second coder was employed to establish the reliability
of scoring for both the identification of cues and the assignment of cues to lines of reasoning in the present study. The proportion
of agreement between the two coders was 0.90 for the identification of cues and 0.82 for the assignment of cues to
lines of reasoning. Both values are well within the range found in other, similar investigations (e.g. Johnson et al., 1982). A
coefficient of agreement generally referred to as Cohen's κ (see Cohen, 1960) of 0.72 was also computed for the assignment
of cues to lines of reasoning. Cohen's κ is used to measure the proportion of agreement between two scorers in placing items
into a set of k unordered categories. κ is directly interpretable as the proportion of judgments in which there is agreement,
after chance agreement is excluded. Although no sampling distribution exists for the κ statistic, Cohen has argued that κ values
can be transformed into Z scores. When this is done the κ value of 0.72 is significant at p < 0.001.
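
The coefficient of agreement described in this note is computed as (p_o − p_e)/(1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each coder's category frequencies. The sketch below shows the calculation on a small invented example. The function name `cohens_kappa` and the two label lists are hypothetical, and the data do not reproduce the codings behind the 0.72 value reported here.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders (Cohen, 1960)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed proportion of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from the product of the coders' marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented assignments of four cues to lines of reasoning.
coder_a = ["growth", "growth", "tax", "tax"]
coder_b = ["growth", "growth", "tax", "growth"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

In this invented example the coders agree on three of four items (p_o = 0.75) while chance agreement is 0.50, so the raw proportion of agreement overstates reliability relative to κ.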

² We take the requirements for a task to be a description of what must be done in order to perform it rather than how these
activities are carried out for a particular instance of the task at a given moment in time (see Johnson et al, 1987, for a further
discussion of this issue).

Fig. 1. Requirements for the task of concurring partner review.


industry. This context led to expectations of a
large investment in research and development,
the possibility of inventory obsolescence, pro- 
duct liability litigation, and increases in sales and 
net income. The high tech, high growth context 
provided a “story” which the subjects used to
interpret cues, and provided a plausible
explanation for the increase in sales, sales returns,
inventory, accounts receivable and fixed assets. It also
made it difficult to identify the frauds perpet- 
rated by the management of the company. 
Based upon their initial context for the case, 
each subject then constructed a problem
representation in which potential solutions and
cues were related to one another. The general
form of this representation for both subjects is
shown in Fig. 2.
According to Fig. 2, income taxes in the case
were neutral (0) with respect to the firm
because there is a good explanation for the
change noted. The cues in the task that are part
of the “growth” subcategory provide strong
evidence in support of the interpretation of the
firm [confirming evidence (+)]. Fig. 2 also shows
the sales returns and gross margin cues, which
are inconsistent with a growth company. If these
cues are combined with the “growth” cues, they
become diluted and do not affect the outcome. If
they are evaluated separately, they enable the
auditor to detect the fraud.

The cues which suggest that the auditor’s re- 
sponse should be one of several qualified audit 
reports are also shown in Fig. 2. The cues relat- 
ing to changes in accounting methods can be in- 
terpreted in one of three ways. If interpreted in 
terms of profitability, they suggest an income 
manipulation (fraud); if interpreted in terms of a
qualification, they suggest a consistency qualifi- 
cation; and if interpreted in terms of disclosure, 
they suggest an unqualified (clean) opinion. The 
litigation and going concern cues are evaluated 
with respect to issuance of a qualified opinion. 


[Figure 2 diagram; legible node labels: TAXES, PROFITABILITY (P), DISCLOSURE]






Fig. 2. Problem representation for the task of concurring 
partner review. + = consistent, — = inconsistent and 0 = 
neutral with respect to the LOR being evaluated. 

The expert subject (S1) noticed a decline in
gross profit, which is inconsistent with a growth
company, and a large increase in sales returns.
Based on his experience, he knew that companies
in this industry tend to manipulate income
by shipping goods to “co-operative” distributors
and recording sales revenue in the current
period, and then accepting the goods back
and recording sales returns in the subsequent
period. Based on his specific industry knowledge,
the increase in sales returns was a strong
indication of the existence of the fraud. This is
illustrated by the following excerpt taken from
his protocol:


Return policy focuses primarily on the key question of
the real risk that receivables from distributors, with the
high level of return, may be cycled through co-operative
distributors on the basis that after the end of the year the
merchandise could be returned, and the result of that
would be to inflate both the company’s sales and its profits,
since merchandise which is on consignment cannot
be recorded as sales and certainly profit can’t be recorded.
However, there has been history in this industry
of that kind of thing occurring on the basis of either written
or unwritten agreements to take back merchandise
from distributors. The high tech industry has been
plagued with that!


The novice subject went down the garden 
path and was willing to issue an unqualified 
(clean) opinion. He explained away the early 
discrepant cues with reference to the growth 
story. For example, the increase in sales returns 
was explained by assuming that the company 
must have improved its warranty policy. 


A dramatic increase in returns and allowances relative to 
the not that great increase in sales. I guess the warranty 
quality issue remains a concern as I go through these 
statements. 


In a similar way the novice explained away the 
increase in inventory by assuming that the com- 
pany was stockpiling inventory because they 
must be expecting a major sale. 


I would pose the question of whether or not we're plan- 
ning for some markedly increased deliveries and if we 
are, do we have commitments for these deliveries, do the 
orders exist to absorb all of this inventory. 


Since he was unable to find any problems in the 
financial statements, the novice was willing to 
issue an unqualified opinion. 

Domain-specific problem-solving strategies, 
termed lines of reasoning (LOR), were next 
identified (Johnson et al., 1982). For the medical
products case, a LOR represents related pathways
of thought incorporating generally accepted
auditing standards, auditing concepts, accounting
practices, and industry knowledge, organized
through a network of relationships that
specify potential solutions to the task. It is used
to interpret (give meaning to) cues in the task. In
the medical products case subjects employed 
three separate lines of reasoning: 


Profitability LOR. Here the company is
viewed as a firm in which a growth in net income 
is caused by an increase in sales. The audit risk 
relates to an overstatement of net income. There 
is the possibility of detecting a manipulation of 
income, in which case the auditor will issue an 
adverse opinion, unless the statements are re- 
vised appropriately. This line of reasoning con- 
tains an assertion that can be tested, namely that 
the company is growing and that this growth is 
responsible for the income being reported. It 
also generates expectations regarding expendi- 
ture on research and development, investment 
in plant and equipment and increases in sales, 
net income, receivables, and inventory. 


Qualification LOR. In this line of reasoning 
the auditor considers items such as litigation, 
liquidity and changes in accounting policies as 
leading to specific types of qualifications in the 
standard audit report. When the expert auditor 
uses the qualification LOR for a high-tech com- 
pany, he is able to anticipate items such as litiga- 
tion, warranties and inventory obsolescence and 
develops expectations for the audit risk as- 
sociated with these items. 


Disclosure LOR. This line of reasoning is based 
on the auditor’s assessment of the adequacy of 
disclosure of the various aspects of the organiza- 
tion. For a high-tech company this leads to ex- 
pectations regarding disclosure of items such as 
outstanding litigation, commitments for expan- 
sion, and financing arrangements. 

Figure 3 presents a trace of the line(s) of
reasoning adopted by the expert subject (S1) at
points in the case where each major cue was presented.
As shown in Fig. 3, the expert used the
profitability LOR to interpret most of the early
cues. He reviewed all the material to identify unusual
and high risk items. The growth cues were
predominantly interpreted using the profitability
LOR, and cues such as lawsuit and increase in
long-term debt were interpreted using the qualification
and disclosure LOR, respectively. He
then came back and followed up the issues identified
in the review. Based on his review, S1
noticed a large increase in sales returns (cue 17)
which was considerably in excess of the increase
in sales (cue 16). He used the profitability LOR
to interpret this cue. Based on his experience, he
knew that companies in this industry tend to
manipulate income by overstating sales to
“cooperative” distributors and then recording sales
returns in subsequent periods. This contradicts
the assertion provided by the profitability LOR
that growth in demand for the company’s products
is responsible for the income reported.
Because the expert has an alternative explanation
for the increase in income, he is able to detect
the fraud. As can be seen in Fig. 3, the expert
kept all three lines of reasoning “open” until he
had completed his review. He used the profitability
line of reasoning to correctly interpret the
critical sales returns cue.

[Figure 3 cue list; legible column headings: QUALIFICATION, DISCLOSURE]

Nature of products
Obsolescence
Lawsuit
Change in current assets
Increase in accounts receivable
Increase in allowance for accounts receivable
Increase in inventory (finished goods)
Increase in PPE
Increase in accounts payable
Tax/pay = zero
Increase in long-term debt
Increase in deferred taxes
Increase in retained earnings
Aging of payables
Additional paid-in capital
Increase in sales
Increase in sales returns
Increase in COGS
Small tax provision
Method of recording sales
Inventory in hands of distributors
Obsolescence (inventory)
Inventory turnover
Increase in accounts receivable
Restricted stock
Sale of common stock
Capitalization policy for PPE
Deferral of patent costs
Sale of product line
Increase in working capital
Financing of working capital
Statement of changes in S/E
Foreign currency transactions
Change in accounting (molds and dies)
Disclosure of R & D costs
Capitalization of R & D costs
Disclosure of ITC
Revenue recognition
Interest capitalization
Change in accounting estimate
Disclosure of long-term debt
Common stock note
Tax note
Commitments and contingencies
Increase in manufacturing capacity
Disclosure of depreciable lives
Relationship between accounts receivable, sales and sales returns
Return policy on sales

Fig. 3. Expert line of reasoning.

Figure 4 provides a trace of the line of reason- 
ing used by the novice subject (S2). This subject 
went down the garden path, and issued an un- 
qualified (clean) opinion. He explained away 
discrepant cues with reference to the growth 
subcategory. As discussed earlier, the increase in 
sales returns (cue 18) was explained by assum- 
ing that the company must have improved its 
warranty policy. The increase in inventory (cue 
8) was explained by assuming that the company 
was stockpiling inventory in anticipation of a 
major sale soon after year end. The novice did 
not find any problems in the balance sheet and 
income statement, so he dropped the profitabil- 
ity LOR and concentrated mostly on disclosure. 
This can be seen clearly in Fig. 4, where the early 
cues are being interpreted using the profitability 
LOR. However, instead of looking for areas of 
risk, the novice is trying to make all the cues fit 
together. Due to this, the discrepant cues are 
explained away. While reviewing the notes to 
the financial statements the novice doesn’t 
notice the decline in research and development 
expenses and is not able to combine the various 
income-enhancing cues. Discrepant cues which 
come at the latter stage of the task (accounting 
policy changes: cues 26, 29, 32, and 34) were
evaluated with respect to their effect on disclosure,
and the subject was willing to sign an
unqualified opinion after some disclosure and
financial statement presentation items were 
“cleaned up”. For the novice, early closure of 
alternative lines of reasoning led to problem-sol- 
ving errors (a result consistent with those ob- 
tained in medicine, e.g. Elstein et al., 1978). 

In general, the analysis presented here reveals
that the expert subject used a highly efficient
problem-solving strategy. Using a combination
of three lines of reasoning, the expert did an
overall risk analysis and identified areas of concern.
He used his knowledge of the industry to
identify problems and performed the calcula- 
tions required to support his conclusion. His 
tolerance level for errors was well tuned and he 
was able to find the problems embedded in the 
case. This result suggests the hypothesis that ex- 
pert auditors may have knowledge of the types 
of manipulations that are likely in a particular in- 
dustry. When a set of critical cues are combined 
appropriately, they represent a pattern that is 
recognized as a fraud. The analysis also suggests 
that novices may not have such knowledge, and 
are thus far less likely to find frauds, even though 
they pay attention to the same cues as the ex- 
pert. The industry-specific nature of this knowl- 
edge suggests that industry specialization may 
lead to the creation of knowledge structures 
which enable auditors to conduct industry- 
specific reviews of financial statements. In addi- 
tion to having this industry knowledge, it may be 
important that the auditor be able to keep the 
relevant lines of reasoning “open” so that he/she 
has an opportunity to recognize problem situa- 
tions when they occur. 


Fig. 4. Novice line of reasoning.


SUMMARY AND CONCLUSIONS


This paper has reviewed a “slice” of work in
the field of auditing. This slice consisted of research
that attempted to understand audit judgment.
Beginning with the lens model work of
Ashton (1974), and continuing through the
heuristic and bias work of Joyce & Biddle
(1981), there was a tendency to rely on “laboratory
versions” of audit tasks. Moreover, little development
of constructs or theory seems to
exist to explain auditor behavior. Even when re- 
search moved toward detecting auditor decision 
processes, researchers addressed the short-term 
goal of looking for “rules” to be used in building 
decision aids for practicing auditors rather than 
the longer term goal of developing concepts and 
principles that might guide subsequent re- 
search. The upshot of the “themes” identified in 
the work examined here is that the field of audit 
research finds itself looking to other fields for 
ideas and insights. 

Rather than continue in this way, it has been 
suggested that the field of audit judgment is a 
worthy area of investigation in its own right. This 
means that the problems auditors encounter and 
the methods suitable for addressing them should 
arise from the field itself, rather than from the 
work of investigators in other disciplines. As an 
example of how such an argument might be 
made, the role of context and meaning in audit 
judgment tasks was considered. More specifi- 
cally, it was proposed that a theory of the task of 
auditing be developed as a means of providing 
hypotheses regarding the operative knowledge 
underlying successful task performance. This 
knowledge, called expertise, is used by auditors 
at all levels to give meaning to the characteristics 
of the tasks they perform. An example of work
designed to describe the expertise of practising
auditors in the task of concurring partner 
reviews and fraud detection was presented to il- 
lustrate the proposed methodology. 

There is no single best way to do research. A 
major strength of any field is the diversity of ap- 
proaches it tolerates and the atmosphere it pro- 
vides for the exchange of ideas on the results and 
merits of each alternative. What is proposed 
here is but one of a number of directions audit 
research might proceed from its current posi- 
tion. The important thing is to recognize that ul- 
timately it is the thoughts and actions of the 
practicing auditor that need to be explained. 
For, as Ulric Neisser in his book Cognition and 
Reality notes:


. . . The prediction and control of behavior is not primarily
a psychological matter. What would we have to know to
predict how a chess master would move his pieces or his
eyes? His moves are based upon information he has
picked up from the board so they can only be predicted
by someone who has access to the same information. In 
other words, the aspiring predictor would have to under- 
stand the position at least as well as the master does. He 
would have to be a chess master himself. If I play chess 
against a master he will always win precisely because he 
can predict and control my behavior while I cannot do 
the reverse. To change this situation I must improve my 
knowledge of chess, not my knowledge of psychology ... 
(Neisser, 1976, pp. 182-183).


LEE ROY BEACH and JAMES R. FREDERICKSON
University of Arizona


Abstract
Building upon the analysis of audit decisions by Waller & Felix (1984) [in Moriarity, S. and Joyce, E. (eds)
Decision Making and Accounting: Current Research (Norman, OK: University of Oklahoma Press, 1984)],
an alternative to the expected utility and “heuristics and biases” characterization of decision-making is
presented. In this alternative, Image Theory, decisions are about adoption of goals and adoption of plans for
attaining the goals, as well as about whether the plans are succeeding. The audit process is interpreted in


Behavioral accounting has enthusiastically
adopted the concepts provided by expected
utility theory and the “heuristics and biases” research
(e.g. Ashton, 1982). These concepts are
predicated upon statistical and economic ways
of framing decision problems. As such, they both
prescribe and presume a deliberative, analytic
approach involving decomposition of the decision
problem, quantitative evaluations of its
components, mathematical (usually compensatory)
recomposition of the evaluations, and application
of a consistent decision rule, ordinarily a
form of maximization.


One cannot deny the attractiveness of these
concepts, their prescriptions, and of the statistical
and economic manner of thinking that gave
rise to them. They appear rational, they are
couched in the precise language of mathematics,
and they have generated a very large scholarly
literature. However, although they may be attractive
prescriptively, the suspicion lingers that
they miss the essence of how decisions actually
are made (Hershey & Shoemaker, 1980;
Mintzberg, 1975; Peters, 1979; Selznick, 1957;
Schwab et al., 1979). The purpose of this paper
is to address this lingering suspicion by bringing
together two parallel lines of thought about an
alternative characterization of decision-making.
One line of thought is from behavioral accounting
(Waller & Felix, 1984) and one is from behavioral
decision research (Beach & Mitchell,
1987). Both lines of thought differ substantially
from the generally accepted characterization of
how decisions are made. Our goals are to provoke
debate about the adequacy of the accepted
characterization for auditing decisions and to
propose an alternative.


THE FIRST LINE OF THOUGHT: 
THE SCHEMATIC MODEL OF AUDITING 


In 1984 Waller & Felix proposed a new model
of how auditing takes place. Their thesis was that
the auditor reaches an opinion about the absence
of material error in a set of financial statements
through a series of revisions and modifications
of his or her knowledge structure. These
revisions and modifications are made in light of
audit evidence about account balances and
about the procedures used by the client to collect
and store accounting information. The
auditor’s knowledge structure both guides the
search for and interpretation of the evidence
that modifies it, and the structure’s modified
form represents the current state of that evidence
vis-a-vis requirements that the client’s
data and procedures must meet.


The schemata

The auditor’s knowledge structure is composed
of various “template schemata” that together
abstractly represent the audit. These
schemata provide templates for judging materiality,
the maximum allowable audit risk, the
minimum allowable justification for a conclusion
about audit risk, etc. But the templates are
not absolute. They depend upon how the
auditor’s opinion ultimately will be used and the
perceived cost of a wrong opinion.

The auditor is seen as having an abstract
schema for a “normal audit” and an appropriately
different one for a “problem audit”. If,
as evidence is acquired, there is a lack of “fit” between
the evidence and the schema for a normal
audit, the auditor will switch to the problem
audit schema, which usually entails more demanding
criteria for information search and for
the quality of evidence.


The audit 

The opinion formulation process is divided 
into four steps: “(1) deciding to perform the 
audit, (2) gaining an understanding of the client 
... and a preliminary evaluation of internal ac- 
counting controls . . . , (3) planning and execu- 
tion of audit activities . . . , and (4) forming an 
opinion” (Waller & Felix, 1984, p. 37). Step 1 in- 
volves deciding about the “auditability” of the 
potential client, which is assessed by the “fit” be- 
tween available evidence and the schema for a 
normal audit. 

Step 2 involves obtaining evidence about the
“fit” between how the client’s accounting information
system actually transforms economic
activity into accounting numbers and the auditor’s
schema for how that transformation
ought to be accomplished. The outcome of this
evaluation sets the criteria that must be met by
evidence obtained during the execution of the
audit.

Step 3 involves planning and execution of the
audit in light of the criteria set in step 2. Knowledge
obtained in steps 1 and 2 drives the information
search process (compliance tests and
substantive tests) both in terms of the tests performed
and in terms of the conclusions drawn
from the tests. Recall that the auditor’s knowledge
structure, the schema, both guides information
search and represents the current state
of that search as it derives from the evidence obtained
thus far. Therefore, the schema involved
in step 3 is seen as a prototypical schema for 
auditing that is tailored to fit the special cir- 
cumstances of the particular audit in question. 
This is accomplished through the modification 
and revision of the schema in steps 1, 2 and 3. 
Hence, any particular audit, while similar to 
other audits, is unique because the prototypical 
schema is modified in light of local information. 

Step 4 involves assessing the “fit” between the 
picture acquired in the execution of the audit 
(step 3) and the criteria imposed by the schema 
(steps 1 and 2). Sufficient fit indicates lack of 
material error in the client’s financial state- 
ments. Insufficient fit warrants continued audit 
work or recommendation that the client adjust 
the financial statements. If neither of these 
courses of action produces appropriate results, 
then a qualified opinion must be issued. 


The decision rule 

The foregoing does much violence to the Wal- 
ler & Felix (1984) analysis, primarily by ignor- 
ing its many subtle points and by emphasizing 
some of its aspects that are cogent to our own 
viewpoint. However, the theme is accurate: the 
auditor brings to the audit a sophisticated 
schema for how audits are performed and the re- 
quirements for their successful completion and 
for low risk. The auditor adjusts this schema, 
subject to generally accepted auditing stand- 
ards, to the specific characteristics of the current 
audit. Decisions about accepting or rejecting the 
client, the adequacy of internal controls, the 
types and amounts of evidence to be collected, 
and the final conclusion about the fairness of the 
client’s financial statements are guided by the 
“fit” between the evidence collected through 
the audit process and the demands of the 
schema. 

Notice that in the foregoing argument, the decision
rule is always sufficiency of “fit” rather
than some version of maximization. That is, the
rule is non-compensatory: a superabundance
of fit with one part of the schema does not compensate
for lack of fit with some other part.
Maximization, and any decision model that implies
or relies upon it, is inconsistent with a process
that demands that each and every one of a
set of minimum criteria be met. Non-compensatory
models, such as the lexicographical model
(Tversky, 1969) or elimination by aspects
(Tversky, 1972), are more consistent with such
a demand, but to reduce the processing task to a
manageable size, both of these models only use
part of the available information in the decision.
Labor reduction is not a feature of the Waller &
Felix analysis: within cost limits, auditors
clearly attempt to use as much information as is
required to form an opinion and they are careful
not to oversimplify the processing task lest they
compromise the integrity of the audit.
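
The contrast between a sufficiency-of-fit rule and a maximizing rule can be made concrete with a small sketch. It assumes, purely for illustration, that fit with each part of the schema can be scored between 0 and 1; the criterion names, minimums, weights and scores are hypothetical and do not come from Waller & Felix.

```python
def sufficient_fit(evidence, minimums):
    """Non-compensatory rule: every criterion must reach its own minimum."""
    return all(evidence[name] >= floor for name, floor in minimums.items())


def weighted_total(evidence, weights):
    """Compensatory score: surplus on one criterion can offset a shortfall on another."""
    return sum(weights[name] * evidence[name] for name in weights)


# Hypothetical criteria and fit scores on a 0-1 scale (illustrative only).
minimums = {"internal_controls": 0.6, "receivables": 0.6, "inventory": 0.6}
weights = {"internal_controls": 1.0, "receivables": 1.0, "inventory": 1.0}
audit_a = {"internal_controls": 0.7, "receivables": 0.7, "inventory": 0.7}
audit_b = {"internal_controls": 1.0, "receivables": 1.0, "inventory": 0.3}

a_fits = sufficient_fit(audit_a, minimums)   # every criterion clears its minimum
b_fits = sufficient_fit(audit_b, minimums)   # inventory falls short
a_total = weighted_total(audit_a, weights)
b_total = weighted_total(audit_b, weights)   # higher total despite the shortfall
```

In this invented case the second audit has the larger weighted total, so a maximizing rule would prefer it, yet it fails the sufficiency rule because its surplus elsewhere cannot make up for the weak inventory evidence.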


The Waller & Felix (1984) analysis really is
quite bold. It is, after all, a profound departure
from the generally accepted characterization of
how decisions are made, i.e. expected utility
maximization, tempered by heuristics and
biases. However, like many bold ideas, the Waller
& Felix analysis may have been a bit before its
time. There was no general decision model available
to which their ideas could be tied. In its absence
they adopted the schema concept. This is
a reasonable choice but there are so many kinds
of schema (scripts, prototypes, stereotypes,
causal models and, as we shall see, images) that
the term tends to be distressingly imprecise. Dis- 
comfort about this imprecision sometimes 
causes readers to overlook the other fine qual- 
ities of analyses that use the concept. 


THE SECOND LINE OF THOUGHT:
IMAGE THEORY


In 1985, Beach & Mitchell circulated a working
paper that outlined a new theory of personal
decision-making. Over the ensuing two years the
theory has been refined and extended, and an
updated version has been published (Beach &
Mitchell, 1987). In addition, the theory has been
extended to organizational decision-making
(Mitchell et al., 1986) and two-person decisions
(Beach & Morrison, in press). Empirical testing
has begun (Beach et al., in press; Brown et al.,
1987) and thus far the theory looks quite promising.
The following is a brief description of the
main points of the theory. The original personal
decision-making version will be outlined here
because it is the most easily presented. Then the
theory will be applied to auditing decisions following
the lead of the Waller & Felix (1984)
analysis. Finally, some possibilities for research


Images are informational representations,
schemata if you will, that are specific to decision
behavior (Miller et al., 1960). They represent
the decision-maker’s ideals or principles relevant
to some sphere of decision-making, his or
her goals in that sphere, what he or she is doing
to reach those goals, and his or her view of how
those efforts are progressing. Decisions are the
result of the decision-maker’s aero of 
four images: 

(1) The self image, the constituents of which 
are the decision-maker’s basic values, morals, 
ethics, behavioral codes, etc., which are unquestioningly
regarded as self-evidently and imperatively
desirable. Collectively these are called
principles. Examples are religious beliefs, busi- 
ness ethics, requisites of good manners and 
standards for social intercourse, and fundamental
personal beliefs.

(2) The trajectory image, the constituents of
which are the decision-maker’s immediate and
remote goals, temporally ordered to form an
agenda for the future. Examples are landmark
goals such as getting tenure or more humble
goals such as finishing a project on time. Candidates
for adoption as new goals are evaluated
with respect to the constituents of the self image
and with respect to the existing constituents of
the trajectory image. The decision rule is that if
candidates are insufficiently compatible with
the constituents of these two images they are rejected.
Otherwise they are adopted for the
trajectory image.

Goals are desirable future events and states.
They frequently are abstract, and their achievement
is signalled by concrete surrogate events.
For example, attainment of a long-sought pay
raise may be signalled simply by a new set of
numbers on a pay voucher. Indeed, most of our
landmark goals (getting a degree, tenure, success)
are highly abstract. The concrete surrogates
for such goals are called markers. Their
occurrence is an indication that either the goal
has been achieved or that substantial progress 
has been made toward achieving it. Thus, disser- 
tation committee members’ congratulations are 
a marker for having earned the Ph.D., but are not 
themselves the state of holding the degree. A let- 
ter from the Dean may be a marker for tenure, 
but it is not itself the state of being tenured. 
These markers can be thought of as all-or-none 
criteria. A goal may have many markers, all or 
most of which must occur in order for the goal to 
be regarded as having been achieved. 

(3) The action image, the constituents of 
which are the various plans that the decision- 
maker implements to attain the goals on the 
trajectory image. Examples are teaching well, 
taking on committee assignments, getting 
grants, doing research and publishing papers, all 
in the service of attaining the tenure goal. When
a new goal is adopted or when progress toward 
goal attainment is insufficient, a new plan must 
be adopted for the action image. The decision 
rule is that to be adopted, the evaluation of a can- 
didate for the new plan must show it to be com- 
patible with the self image (i.e. it must not seri- 
ously violate the decision-maker’s principles), 
and to offer promise for goal attainment. 

Plans are general strategies. Tactics are the
components of plans. They are the specific acts
that are performed in the process of implementing
the plan. Some tactics are fairly well defined
at the time the plan is adopted, some are less defined
but become better defined as the time for
their execution approaches. Some are dependent
upon each other or must be executed
simultaneously, some are contingent upon local
circumstances when their time comes. When a
plan is selected, its major component tactics
often can be anticipated. Indeed, the major tactics
are tailored specifically to the goal’s markers.

(4) The projected image, the constituents of
which are the events and states that the decision-maker
anticipates will eventuate if he or she pursues
the current plan. For example, while implementing
the plan to achieve tenure the decision-maker
may realize that the necessary research
will not be completed by the time assumed in
the original plan, thereby threatening goal attainment.
When evaluation of the correspondence
between the projected image and the
trajectory image shows that the two do not
match, that they are insufficiently compatible,
the decision rule requires reevaluation of the
plans and either replacement by more promising
plans, or if no plan seems promising, rejection of
the related goals as unattainable.


TYPES OF DECISIONS, DECISION RULES AND 
DECISION SITUATIONS 


Image theory requires one to view decisions 
in a slightly different way than is customary. 
Rather than solely being about alternative 
courses of action, decisions primarily are about: 
(1) adopting or rejecting goals and the plans to 
attain accepted goals, and (2) whether those 
plans are making enough progress toward goal 
achievement to warrant their continued implementation.
These are called adoption decisions
and progress decisions respectively.

Decisions are made either in terms of compatibility
or in terms of profitability. Compatibility
is a simple, easy evaluation of the acceptability
of a goal or plan for adoption, or of a plan’s
progress toward a goal. It means that the candidate
goal or plan does not violate the self image’s
principles, the trajectory image’s goals, or the
action image’s plans. “Violate” means that some
aspect of the candidate either fails to express or
further an image’s constituents, or that it actually
counters them. At its simplest, when a candidate
goal or plan is being considered for adoption the
decision-maker simply tallies the number of
constituents of each image that the candidate
violates. When (if) this sum exceeds a threshold the
candidate is rejected, otherwise it is accepted.
Compatibility also is the measure of acceptable
congruity between the trajectory and projected
images. In this case the decision rule is that when
(if) the number of violations exceeds the threshold,
reassessment of the plans and goals is instigated.
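
The compatibility rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` structure, the image names used as keys, and the threshold value are hypothetical and are not specified by Image Theory itself.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Candidate:
    """A candidate goal or plan (hypothetical structure for illustration)."""
    name: str
    # Number of violated constituents per image, e.g. {"self": 1, "trajectory": 0}.
    violations: Mapping[str, int]


def total_violations(candidate: Candidate) -> int:
    # Only violations are tallied; constituents that are not violated
    # earn no offsetting credit.
    return sum(candidate.violations.values())


def is_compatible(candidate: Candidate, rejection_threshold: int) -> bool:
    # Rejected when the tally exceeds the threshold, otherwise accepted.
    return total_violations(candidate) <= rejection_threshold
```

For instance, a candidate with two violations passes a threshold of 2, while one with four violations does not. The same comparison could serve for progress decisions if the tally were taken between the trajectory and projected images.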

Compatibility is conservative and it is non- 
compensatory, in that failure to violate some 
constituents will not compensate for violations 
of others. This means that sufficient compatibil- 
ity is rather difficult to attain and it therefore 
serves to maintain the status quo and avoid 
abrupt and profound changes in the ongoing 
course of the decision-maker’s life. 

Profitability is defined as the degree to which 
a candidate goal or plan offers attractive conse- 
quences over and above compatibility. In adop- 
tion decisions profitability serves to select one 
candidate when there are multiple candidates 
that are compatible with the images; it never 
serves progress decisions. For profitability- 
based adoption the decision rule is not merely 
sufficient compatibility (meeting marker 
criteria), it also includes maximization. That is, 
when deciding among multiple alternatives, the 
subset of alternatives that are sufficiently com- 
patible with the decision-maker’s images are 
then evaluated in terms of their relative profita- 
bility. The decision rule is that the alternative 
with the maximum profitability is selected. Profitability
is compensatory, and there are numerous
decision strategies that can be selected for
making this “best choice” decision, among
which are the various expectancy versions of
cost-benefit (for discussions of strategies and
their selection see Beach & Mitchell, 1978;
Christensen-Szalanski, 1978, 1980).
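
The two-stage rule for profitability-based adoption (screen on compatibility, then maximize) might be sketched as follows. The `Option` tuple and the single numeric profitability score are assumptions made for the example; the text leaves open which strategy produces that score.

```python
from typing import NamedTuple, Optional, Sequence


class Option(NamedTuple):
    name: str
    violations: int       # tally from the compatibility test
    profitability: float  # assumed to be already scored on a common scale


def choose(options: Sequence[Option], rejection_threshold: int) -> Optional[str]:
    # Stage 1: keep only the sufficiently compatible alternatives.
    survivors = [o for o in options if o.violations <= rejection_threshold]
    if not survivors:
        return None  # nothing passes the screen
    # Stage 2: select the survivor with maximum profitability.
    return max(survivors, key=lambda o: o.profitability).name
```

Note that an alternative with the highest profitability is never selected if it fails the compatibility screen, which is the sense in which profitability operates "over and above" compatibility.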

Finally, decision situations are either optional 
or non-optional. Optional decisions are those in 
which it is possible to avoid change and to re- 
main with the status quo. They almost always are 
made on the basis of sufficiency of compatibility. 
Most adoption decisions involve a single candi- 
date as an alternative to the status quo, and they 
are optional decisions. All progress decisions are 
optional decisions. 

Non-optional decisions are those in which 
change is inevitable and it is not possible to re- 
main with the status quo. They almost always in- 
volve multiple alternatives, each of which is a 
candidate for replacing the terminating status 
quo, and in such cases they are made using
maximization of profitability and the process
looks very much like maximization of expected
utility.


BRINGING TOGETHER THE TWO LINES 
OF THOUGHT 


Now we turn to translating the foregoing con- 
cepts into terms that relate to auditing. This will 
rely upon the Waller & Felix (1984) analysis, 
and upon a companion analysis by Felix & Kin- 
ney (1982). 

Mitchell et al. (1986) redefined the four per- 
sonal images described above into organiza- 
tional terms (e.g. the organizational self image, 
etc.), and examined the implications of regard- 
ing an organization as a single decision-maker. 
The exercise was interesting, and it was fruitful 
in terms of research ideas (e.g. Beach et al., in 
press). Our intent is to do much the same thing 
for the audit process, assuming that the auditing 
firm is the decision-maker and the decisions are: 
(1) whether to accept or retain a client, and (2) 
whether the client’s financial statements are 
without material error. In this we adopt
Schandl’s (1978) view that “The purpose of the
audit is to see if events conform to some desired
state of affairs . . . [where] . . . the norms used are
the image or images of the desired state of affairs
. . .” (p. 69).


The images 

The firm’s self image consists of the firm’s entire
set of principles, including subsets of business
principles (e.g. profits, clientele), subsets of
acceptable accounting principles, and subsets of 
acceptable auditing standards and techniques 
(appropriate compliance and substantive tests). 
The latter two subsets will be called the firm’s 
audit principles. 

The audit trajectory image consists of multi- 
ple goals, each of which is a correct audit for 
each of the firm’s clients. However, within this
abstract goal of correctness is the concrete fact
of the client’s financial statements. Thus, the
concrete goal toward which the audit is oriented
becomes one of seeing if the audit process supports
issuing an unqualified opinion (i.e. lack of
material error in the client’s financial statements).
Sheerly for convenience of exposition,
we will call the latter a “successful” audit; an “unsuccessful”
audit is one in which the audit process
does not support an unqualified opinion.
Each of these two aspects of the concrete goal, a
successful or an unsuccessful audit, has the same
markers, if only because every audit must conform
to the requirements of the audit principles
from the firm’s self image. However, each goal
also is unique to each client in that the markers
for the different clients have slightly different
criteria, each set by the unique properties and
circumstances of that client.

The audit action image consists of the audit 
plans for examining each client’s financial state- 
ments in order to render an opinion about the 
statements. Each plan consists of major tactics
(individual audit steps) that are aimed at making
progress toward satisfying the markers’ criteria
(satisfactory results for an audit step) for each particular
goal (audit). The plan’s minor tactics are
more or less assumed. They consist of detailed 
activities such as ascertaining where informa- 
tion is filed, footing columns of numbers, making 
phone calls and all the rest of the day-to-day ac- 
tivities that make a plan work, but are not neces- 
sarily thought out at the time the plan is formu- 
lated. 

The audit projected image consists of the an- 
ticipated success or failure of tactics to meet the 
markers’ criteria. In short, this image reflects 
whether continued implementation of the cur- 
rent plan will allow the auditor to achieve the 
goal of a successful audit for the client in question.
As the audit progresses, i.e. as the plan is implemented
through the execution of its component
tactics, its anticipated results are assessed
in terms of their compatibility with the criteria 
necessary for goal achievement. Anticipated suc- 
cess leaves the plan in place. Anticipated failure 
results in adjustments to the tactics, such as re- 
vising the audit steps or increasing the extent of 
substantive testing. If adjusting the tactics does 
not reduce anticipated failure, the plan itself 
must be revised. If plan revision provides no 
remedy, the goal must be rejected. Goal rejection
(defining “goal” in the concrete sense discussed
above) means that the firm cannot affirm
without qualification that the client’s financial 
statements are free from material error. 


Types of decisions 

Now, let us examine the two audit decisions 
within this framework: first the firm must decide 
whether to accept a potential new client or 
whether to retain a present client. That is, the 
first decision is an adoption decision — whether 
or not to adopt the goal of a successful audit for 
the client in question. This adoption decision de- 
pends upon the compatibility of the client's attri- 
butes with the firm’s audit principles. That is, do 
the client’s attributes violate any of the firm’s 
relevant principles? Information about these at- 
tributes is drawn from records, from inquiries 
made of previous auditors or audits, from 
governmental agencies, etc. (Waller & Felix, 
1984). If the information indicates that the 
client’s “auditability” is high, and that the audit 
also is an appropriate undertaking for the firm in
other regards (e.g. advances the firm’s business 
goals), the client will be adopted or retained, 
otherwise it will not. Once adopted, the client 
(goal) becomes a constituent of the firm’s audit 
trajectory image, with its location on the image 
being determined by the deadline for comple- 
tion of the audit. 

The second decision is the error decision, the 
decision about whether the audit evidence sup- 
ports an unqualified opinion about the lack of 
material error in the client’s financial state- 
ments. In Image Theory terms this is a negative 
progress decision. This is a bit complex, so let us 
take it step by step. 

As outlined by Felix & Kinney (1982) and 
Waller & Felix (1984), the first step in making 
the error decision is a preliminary evaluation of 
the client’s internal accounting controls. Here 
again the firm’s audit principles are brought into 
play, except that this time they are used to set 
the marker criteria that must be met before the 
goal can be regarded as having been achieved. 
Poor internal controls prompt more stringent 
criteria in terms of the timing and extent of substantive
tests. Moreover, it is during this step that
the major tactics of the plan begin to take shape.
The general form of the plan is fairly clear from
the beginning: it is dictated by generally accepted
auditing standards and by the prototypic
audit plan developed by the audit firm. However,
the prototypic audit plan is modified in
light of the preliminary evaluation of internal 
controls and the resulting marker criteria for 
each client. It is during this step in the auditing 
process that the task requirements begin to be- 
come clear and that the process begins to be 
crafted to fit the unique characteristics of the 
particular client and the environment in which 
the client operates. 

The second step toward the error decision is 
the implementation of the audit plan. Each tacti- 
cal activity is aimed at seeing whether the infor- 
mation that it examines progresses the audit to- 
ward achievement of the various marker criteria. 
When a criterion is met, tactical activity related 
to it stops. If prolonged activity does not pro- 
duce progress toward meeting a marker’s criter- 
ion, the tactic is reviewed and changed if some 
alternative seems more promising. If the new 
tactic produces no progress and no alternative 
to it appears promising, the plan itself must be 
reviewed. If the plan does not appear to be faulty
or if no alternative can be adopted to replace it 
(perhaps there simply is no way to obtain neces- 
sary information or the client refuses to cooper- 
ate in some important way), the goal must be re- 
jected. 
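
The escalation just described (change the tactic, then review the plan, then reject the goal) can be sketched as a nested search. This is a simplified illustration: the ordering of plans and tactics by preference and the `meets_criterion` callback are assumptions, and real audits revise tactics with far more judgment than a loop conveys.

```python
from typing import Callable, Optional, Sequence, Tuple

Result = Tuple[str, Optional[int], Optional[str]]


def pursue_criterion(
    plans: Sequence[Sequence[str]],
    meets_criterion: Callable[[str], bool],
) -> Result:
    """Try tactics within each plan in order; escalate only when all fail.

    `plans` is an ordered list of alternative plans, each an ordered list of
    tactic names (hypothetical). `meets_criterion` reports whether executing
    a tactic satisfied the marker's criterion.
    """
    for plan_index, tactics in enumerate(plans):
        for tactic in tactics:
            if meets_criterion(tactic):
                # Criterion met: tactical activity related to it stops.
                return ("criterion_met", plan_index, tactic)
        # No tactic in this plan produced progress: move to the next plan.
    # No plan remains: the (concrete) goal must be rejected.
    return ("goal_rejected", None, None)
```

A caller would supply its own tactic names and progress test; the function only encodes the order in which tactics, plans and finally the goal are given up.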

Goal rejection (i.e. the concrete goal) means 
that the error decision is negative; the firm must 
propose adjustments to the client’s financial 
statements or, lacking client support for adjust- 
ments, it must attach qualifications to its report 
on those statements. A negative error decision 
usually is negative because the plan could not 
produce events that met the markers’ criteria. In 
most cases the plan’s lack of progress accurately
reflects the gap between the client’s financial
statements and the supporting evidence. Thus
the plan’s default leads to rejection of the goal of
a successful audit and to the conclusion that the
client’s financial statements are not materially
correct. The markers whose criteria are unmet
indicate where qualification is required.


Setting marker criteria

The markers for a particular audit are addres- 
sed by the substantive and compliance tests that 
are the major tactics comprising the audit plan. If
the tests yield results that meet the criteria for all
markers, the goal of a successful audit is regarded
as having been achieved and the firm can
issue an unqualified opinion. The question is 
what is meant by criteria and how they are set. 
For markers addressed by substantive tests, 
the criteria are the amounts reported in the 
client’s financial statement. For markers addres- 
sed by compliance tests, the criteria are dictated 
by the firm’s audit principles — its image of ap- 
propriate accounting procedures and internal 
controls. As described by Waller & Felix (1984), 
meeting substantive criteria is of primary impor- 
tance; meeting compliance criteria merely al- 
lows the auditor to rely upon the client’s internal 
controls, thereby allowing reduction in the ex- 
tent and/or timing of substantive testing. 
Theoretically, decisions about whether 
criteria have been met often are described as 
Bayesian, sometimes short-circuited by 
judgmental heuristics. Practically, such deci- 
sions often appear to rest on tests of the null 
hypothesis that the amounts adduced from the 
audit evidence are not significantly different 
from the financial statements. Probably neither 
of these accurately reflects how the decisions ac- 
tually are made. The process of setting criteria 
and making decisions about whether criteria 
have been met cannot properly be described 
using statistical concepts, because both are influ- 
enced by variables that have no counterparts in 
statistics. Although a parallel exists between
these activities and what statisticians do, the re- 
semblance is superficial and can be misleading. 
In his lexicographic model of decision-mak- 
ing, Tversky (1969) introduced eta, a threshold 
of allowable dissimilarity between two options 
on a descriptive dimension. The idea is that if the 
two options are within eta of each other on the 
dimension the difference between them is im- 
material, so the dimension is dropped from the 
decision. This idea was expanded upon by Beach 
& Solak (1969) and labeled the equivalence interval,
the EI. In this and other research (Beach
et al., 1974; Crocker et al., 1978; Christensen-Szalanski,
1978, 1980; Laestadius, 1970; Larson
& Reenan, 1979), the EI evolved as the
maximum difference that can exist between
some reference point and a derived point that
still permits the two to be regarded as essentially
the same, which also is the definition of materiality
in auditing. For the EI the reference point
may be some external value, say the “correct”
quantitative answer to some question. The derived
point may be a subject’s intuitive answer
or an answer he or she derives from some data
gathering process. The EI is the maximum difference
between the two at which the subject continues
to regard his or her answer as essentially
correct. When subjects were asked to designate
EIs it was found that many variables, some statistical
and some not, influenced the breadth of the
EIs.
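
Read as a decision rule, the equivalence interval amounts to a tolerance check. The sketch below assumes a symmetric interval expressed in the same units as the quantities compared; the function name and the numbers in the usage note are hypothetical.

```python
def within_equivalence_interval(reference: float, derived: float, ei: float) -> bool:
    """Return True when the derived point is 'essentially the same' as the reference.

    `ei` is the maximum tolerated absolute difference (assumed symmetric).
    """
    if ei < 0:
        raise ValueError("the equivalence interval must be non-negative")
    return abs(derived - reference) <= ei
```

With a reference of 1000 and an EI of 50, a derived value of 1040 would be regarded as essentially correct and a derived value of 1060 would not. How wide the EI should be is exactly what the research summarized here shows to depend on non-statistical as well as statistical variables.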

For example, EIs are influenced by the familiarity
of the task: sometimes they are narrower
(more demanding) if task familiarity can be expected
to lead to superior performance (Beach
et al., 1974), or they are wider if familiarity
shows the task to be more difficult than at first assumed
(Beach & Solak, 1969). They are influenced
by social equity: wider when equity
demands leniency and narrower when it
demands stringency (Beach et al., 1974). They
are narrower when the method used to derive
the answer to the problem is perceived to be
precise, and wider when it is not (Christensen-Szalanski,
1978). They are wider for small sample
sizes than for large ones (Beach et al., 1974;
Laestadius, 1970), and are wider when these
samples come from populations with large variance
than when it is small (Laestadius, 1970),
both of which conform to statistical theory. But,
they are wider for judgments about large magnitudes
(what is the national debt?) than for
judgments about small magnitudes (how much
change is in your pocket?), irrespective of variance
(Beach & Solak, 1969; Laestadius, 1970),
which does not conform to statistical theory. In
short, EIs behave something like credible intervals,
but then again they do not; they are influenced
by not unreasonable variables such as fairness,
precision and magnitude, that are irrelevant
to statistical credible intervals.

From an Image Theory viewpoint, a marker 
criterion is the specific, precise reference point 
on a marker and the EI indicates the precision 
with which the audit evidence must match that 
precise point in order for the criterion to be re- 
garded as having been met. The preliminary 
evaluation of the client’s internal controls sets 
EIs for criteria on each of the markers relevant to 
the audit: both compliance markers and,
through them, substantive markers. Of course,
these EIs can be revised throughout the audit
process as additional information is gathered. 
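
Putting the marker criteria and their EIs together, the check that every marker's criterion has been met might look like the following sketch. The `Marker` fields, the marker names and the amounts are hypothetical; for substantive markers the criterion is taken, as in the text, to be the amount reported in the financial statements.

```python
from typing import Dict, List, NamedTuple


class Marker(NamedTuple):
    criterion: float  # reference point, e.g. the reported amount
    evidence: float   # amount adduced from the audit evidence
    ei: float         # equivalence interval set for this marker


def unmet_markers(markers: Dict[str, Marker]) -> List[str]:
    # A criterion is unmet when the evidence falls outside the marker's EI.
    return [name for name, m in markers.items()
            if abs(m.evidence - m.criterion) > m.ei]


def supports_unqualified_opinion(markers: Dict[str, Marker]) -> bool:
    # The goal of a "successful" audit requires that all criteria be met.
    return not unmet_markers(markers)
```

The list returned by `unmet_markers` corresponds to the places where, on this account, adjustment or qualification would be required. Revising an EI during the audit is simply a matter of replacing a marker's `ei` value and re-running the check.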

The point is, the audit evidence is not used in 
a strictly Bayesian manner, or even in a heuristic 
manner, nor are statistical tests really very ger- 
main to the error decision. In fact, the auditors’ 
judgments of what constitutes “close enough”, 
materially correct, are determined by many non- 
statistical variables. On the other hand, these 
judgments govern the decision about when a 
criterion has been met and when there has been 
progress toward achieving the goal, so they are 
important to the understanding of audit deci- 
sions. 


SOME RESEARCH QUESTIONS 


Readers who are familiar with the auditing re- 
search literature probably can cite existing 
studies that address many of the points discussed 
here. However, because this is not a review 
article, we will restrict ourselves to outlining 
some of the more apparent research questions 
that arise from our analysis. 

The questions fall into two categories, those 
involving images and those involving the im- 
plementation of the audit and the decisions that 
result from the audit. 


Questions about images 

The first question is about firms’ self images. 
What are the constituents of these images, how 
do they differ from one kind of firm to another, 
and what is the overlap among subsets of the 
constituent principles? We have found that for 
both personal self images (Brown et al., 1987) 
and organizational self images (Beach et al., in
press) it is possible to discover the constituent
principles. Moreover, we have found in both
cases that it is possible to use knowledge of these
principles to predict subsequent decisions.
Auditing firms present an especially interesting
setting for such research because while each
firm is unique, they nonetheless are quite similar 
in many ways. That is, some of their principles 
are unique to the business environment in 
which each of them operates, but other princi- 
ples are similar across firms because of the 
strong norms and guidelines for the auditing 
profession. To be credible, research findings 
must reflect this diversity and similarity. 

A second question about firms’ self images 
concerns the degree to which ethical principles 
pervade the various spheres of the firms’ ac- 
tivities. That is, are ethical principles applied 
more directly in the evaluations of clients than in 
the determination of the firm’s own activities? 
Do firms’ ethical principles (and one assumes, 
their practices) reflect the instruction, or lack of 
it, received in the programs in which the
partners got their training? How are ethical principles,
as well as the other constituent principles,
transmitted to newcomers to the firm, i.e.
how does acculturation take place? 

A third question involves the image of the 
“successful” audit and its role in audit decisions. 
Even though a correct opinion is the abstract 
goal, if an unqualified opinion is the concrete 
goal toward which the audit is oriented, what 
sort of biases does this introduce? It is widely ac- 
cepted that anchor points have an effect on sub- 
sequent decision-making, although the effect to 
be expected is not always clear. In the present
case the audit is seen as having to marshal evi- 
dence to meet the marker’s criteria in order to 
achieve the goal of a successful audit. In this 
view the thrust is from doubt and ignorance to- 
ward affirmation, which according to the 
anchoring and adjustment hypothesis (Tversky 
& Kahneman, 1974) might induce a bias against 
goal attainment. On the other hand, one could 
construct a case for the effects of the thrust, the 
striving toward the goal, producing the opposite 
bias. It really is not at all clear which bias, if any, 
is to be expected, and that is why research is
needed.

A fourth question involves the formulation of 
audit plans. A plan can be thought of as a 
hypothetical scenario about what will be re- 
quired to reach a goal. As Jungermann (1985) 
has pointed out, the structure of scenarios is
different depending upon whether they are con- 
structed backward or forward. That is, if con- 
struction begins with the endpoint and builds 
backward to the present, the scenario tends to 
feature the steps adjacent to goal attainment. If it
builds forward from the present toward the goal, 
it tends to feature the steps necessary for getting 
started. Of course, audit plans are more highly 
prescribed than are the plans (scenarios) for 
many other kinds of decisions. However, the 
question still remains, does it make any signifi- 
cant difference in the audit plan if it is formu- 
lated backward or forward? 

The final question about images involves how 
projected images are created. This is a general 
question for Image Theory, but auditing is a par- 
ticularly appropriate area in which to investigate 
it. The projected image is the anticipated results 
of the audit at any moment during the audit. The 
concept assumes a mechanism for bridging the 
gap between that moment and the future so that 
progress can be assessed. The question is how 
bridging takes place. Is it accomplished, for 
example, merely by extrapolating the present 
moment in some simplistic linear fashion? Prob- 
ably not. Is it accomplished by constructing a 
story about how events might unfold to form a 
path from here to there? Perhaps in some cases,
but that is more cumbersome and more time 
consuming than introspection suggests is usu- 
ally necessary for assessing progress. We admit 
that we do not know the answer, but we submit 
that an answer is needed. It is needed not just for 
Image Theory or for auditing. It is needed for the 
understanding of a phenomenon that we all ex- 
perience, the ability to anticipate the future and 
to act upon that anticipation. 


Questions about implementation 

The first question about implementation actu- 
ally is about the prior issue of how clients are 
adopted or retained. Because clients usually are
considered one by one (it seldom is a question of
this client or that one), their adoption or retention
is an optional decision. Recall that optional
decisions are those in which the firm has the option
of doing nothing at all and staying with the
status quo. In this case staying with the status
quo means either not taking on the new client or 
retaining the old client. Research shows that in 
optional decisions the status quo tends to be 
favored over change (Mitchell et al., 1986). This 
suggests that, business concerns aside, firms may 
tend to be biased against accepting slightly in- 
compatible new clients as well as being biased 
toward retaining slightly incompatible old 
clients. The possible paradox is that, all else 
being equal, a firm might retain a client that it 
would not adopt were it a new client and it 
might reject a new client that it might retain 
were it an old client. A corollary of this may be 
that because progress decisions also are optional 
decisions, there may be a bias toward staying 
with an unsatisfactory audit (the status quo) 
longer than an outside observer would recom- 
mend, perhaps pouring even more resources 
into it in an effort to make it work (e.g. Staw, 
1981). 

The second question about implementation
involves how evaluations of the client’s internal
controls influence the EIs for criteria and the
structure of the audit plan. Clearly, the course of
the entire audit is conditional upon the quality of
the client’s internal controls. During preliminary
evaluation the apparent quality sets preliminary
EIs, thus determining the major tactics of the
audit plan. If, as the audit proceeds and compliance
tests are performed, the preliminary
evaluations of the controls are revised, there
must be a complementary revision in the EIs and
in the tactics that comprise the plan. It is tempting
to couch these revisions in Bayesian terms,
but as explained above, this will not do. There
are too many non-statistical considerations,
from hard data to gossip, at work in these revisions.
Perhaps a better starting place would be
the attitude change literature. While EIs may not
be attitudes in the strict sense, they are not very
different. They have an evaluative aspect and
they have strong implications for behavior in
that they influence both the formulation (and reformulation)
of plans as well as the criteria that
those plans strive to achieve. Hence, attitude
change may be an appropriate perspective for
thinking about EI revision.

The last research question about implementa- 
tion involves how it terminates in decisions. Ig- 
noring the complications implied by the ques- 
tions raised above, the adoption/retention deci- 
sion about clients is more straightforward than 
the materiality decision. It is customary to think 
of the former as a routine business decision 
dominated by financial considerations. However,
this may reflect a stereotype of how business
is done rather than knowledge about how
such decisions actually are made. Certainly the 
firm’s profit is a large factor in the decision, and 
a firm that needs business may let it dominate. 
But, principles other than solvency also may play 
a large role in adoption/retention decisions, and 
it is important to examine their contributions. 

Of course, the error decision is our primary in- 
terest. The Image Theory description of how this 
decision is made is quite different from the usual 
cost-benefit, expected utility description. In the 
present analysis the decision actually is a deci- 
sion about progress toward the goal, rather than 
about the goal itself. Insofar as costs, benefits, 
probabilities or utilities are involved at all, they 
influence the breadth of the EIs for the marker
criteria rather than being properties of or out- 
comes of the goal. No doubt some readers will 
regard this description as distressingly unpar- 
simonious. However, a parsimonious descrip- 
tion of unparsimonious events may be a Pyrrhic 
victory of style over substance. 

Of course, because of its primacy, all of the research
questions outlined thus far have implications
for the error decision. However, additional
questions involve the markers that comprise the 
goal, the manner in which tactics are designed to 
address these markers, whether some markers 
are primary to the goal and others are secondary 
or whether all are of equal status. Finally, there is 
the question of how compliance tests contribute 
to the goal; is the contribution direct or is it 
through influencing the criteria that must be 
met by the substantive tests, as implied by the
Waller & Felix (1984) analysis?


SUMMARY

We have presented an Image Theory interpre- 
tation of audit decision-making based upon 
analyses by Waller & Felix (1984) and Felix & 
Kinney (1982). The interpretation is quite different
from that provided by conventional decision
theory and it is offered as an alternative to
the usual way of thinking about auditing. The decision-maker
is seen as having four knowledge
representations (images) and decisions are
made in the course of managing them. Principles
on the self image serve as criteria for decisions 
about adopting goals for the trajectory image. 
The latter is the agenda of desirable events and 
states for which the decision-maker strives. 
Plans for reaching these goals are adopted with 
regard for the principles and goals. Plans are the 
constituents of the action image. The projected 
image consists of the events and states that are 
anticipated to eventuate if the plans are pursued. 

Decisions address the adoption of candidate 
goals or plans, and abandonment of plans that fail 
to produce progress toward goals. Adoption decisions
are determined by whether there is sufficient
compatibility between candidates and the
constituents of images to warrant adoption of 
the candidates. Progress decisions are deter- 
mined by whether there is sufficient compatibil- 
ity between the trajectory and projected images, 
the desired and anticipated future, to warrant re- 
tention of the present plans. (Profitability some- 
times plays a role in these decisions, but not as 
consistently as compatibility.) When it is an op- 
tion to do so, decision-makers tend to remain 
with the status quo. 

In terms of audit decisions, the firm is re- 
garded as a unitary decision-maker and there are 
two decisions. The first decision is about 
whether to accept a client’s business. This is an
optional adoption decision. The second is the 
error decision. This is an optional progress deci- 
sion. The decision is negative if the audit is un- 
able to progress toward an unqualified affirma- 
tion of the fairness of the client’s financial state- 
ments. That is, contrary to the usual way of think- 
ing about it, in the Image Theory view, the error 
decision is a consequence of the inability of the 
audit process to sufficiently remove doubt about 
material error rather than a consequence of
proving that there is such error.


Abstract


This article presents a critical review of the existing research literature on expertise in auditing and 
explores useful avenues for future research. The review is organized around the two main approaches that 
have been used to study expertise in auditing — the behavioral and the cognitive approaches. The concept 
of expertise within these two approaches is examined. Results from studies using the behavioral approach 
indicate that expert auditors do not behave differently from novice auditors. Possible reasons for this lack 
of significant difference are discussed in the article. Results from studies using the cognitive approach are 
more encouraging. They indicate that there may be knowledge differences between expert and novice 
auditors and that these differences might lead expert auditors to use decision processes that differ from 
those used by novice auditors. Our knowledge about expertise in auditing is, however, still embryonic and
there is a need for more research.


In 1959, Goldberg compared clinical psychologists
and secretaries in making diagnoses of
brain damage. His results were astonishing: clin- 
ical psychologists were no better than sec- 
retaries! Although not as striking, results from 
auditing research on expertise in the last ten 
years also reveal few differences between expert 
and novice auditors in decision-making. 

The purpose of this paper is to provide a criti- 
cal review of the existing research literature on 
expertise in auditing. The objectives of review- 
ing this literature are to evaluate the state of 
knowledge on expertise in auditing and to 
examine useful avenues for future research. 

The paper is organized into four sections. 
First, the concept of expertise in auditing, the 
behavioral sciences and cognitive psychology is
examined. Then, selected research studies on 
expertise in auditing are reviewed. The paper 
concludes with a presentation of what could be 
useful avenues for future research. 


EXPERTISE 


Complexity of the subject matter and the audit process 
indicate the requirement of competence on the part of 
the auditor (American Accounting Association, 1973, p. 
17). 


The essence of all professions — including 
public accounting — lies in the expertise of its 
members. Recognising that some complex tasks
require special competence, society may license 
performance of those tasks exclusively to desig-
nated professionals. Professionals have “a unique
knowledge-set often represented by skills”
(Buckley, 1978, p. 65). A characteristic of the
auditing profession is then a unique knowledge-
set or expertise. The profession regulates the
minimum level of knowledge necessary to be a 
certified public accountant (CPA). 

Being a CPA does not mean that one becomes 
an expert overnight. Expertise is not an all-or- 
nothing attribute. The profession sets entry- 


*I would like to acknowledge the helpful comments from Theodore J. Mock, Barry L. Lewis and the seminar participants at the University of Waterloo.

level standards, but within the firm there are
finer gradations based on experience and
specialization. More specifically, most public ac-
counting firms have a pyramidal structure with
the more experienced people at the top. Within
levels of experience, firms also have people who
specialize in areas such as taxation and compu-
ter auditing. Specialists normally have more ex-
pertise in a particular area and less than the gen-
eral practitioner in other areas (CICA, 1982).
From the above, it appears that auditors pos- 
sess a certain level of knowledge and abilities. 
Also, some auditors might possess more of these 
than others. 


Definition of expertise 

In order to investigate the concept of exper- 
tise in auditing, it needs to be defined and 
operationalized. At present, there exists no gen-
erally accepted definition or measure of exper-
tise. Webster’s Ninth New Collegiate Dictionary
(1983) defines expertise as “the skill of an ex-
pert”, “expert” being defined as “having, involv-
ing, or displaying special skill or knowledge
derived from training or experience.”
Psychologists and knowledge engineers have 
also suggested definitions of expertise. Similar to 
Webster’s definition, these definitions refer to 
the concepts of knowledge and skill/perform- 
ance. For example, Hayes-Roth et al. (1983, p. 4) 
define expertise as consisting of “knowledge 
about a particular domain, understanding of do- 
main problems, and skill about solving some of 
these problems.” 

Since it is impossible to observe expertise 
directly, the concept of expertise should be 
operationalized with observable variables, such 
as years of experience. In auditing, research 
studies almost routinely collect information re- 
lating to years of audit experience that are used 
as a surrogate for expertise. An assumption un- 
derlying these studies is that the number of years 
of experience is a major determinant of exper- 
tise. This might not always be true. One can have 
considerable experience and not be an expert. 
Drinking twenty-four beers a day may make one
experienced, heavy and drunk, but it will not
necessarily make one an expert on beer (Jacoby 
et al., 1984, p. 469). 

Experience must also be related to the task 
because expertise is domain specific (Elstein et
al., 1978; Larkin et al., 1980). For example, an
auditor may have twenty years of auditing ex- 
perience, but not exhibit more “expertise” in 
evaluating internal control than a senior 
because, of these twenty years of experience, 
only three involve evaluating internal control 
systems. Finally, experience may not necessarily 
lead to better judgment and more expertise 
because people may have a number of biases that 
prevent them from using information that ex- 
perience provides (Brehmer, 1980; Waller & 
Felix, 1984b). 

Other approaches, such as peer rating (Elstein 
et al., 1978; Shpilberg & Graham, 1986) and de- 
cision quality (Jacoby et al., 1984), have been 
used to measure expertise in other fields, but 
they have problems too. Peer rating may be 
based on other attributes (e.g. personality) not 
necessarily related to expertise.¹ In auditing, de-
cision quality is difficult to evaluate because
auditing offers few areas in which objective criteria exist
for evaluating the quality of auditor decision
making. Also, an auditor may make an objec-
tively good decision, but for the wrong reason,
or vice versa. 

Expertise is thus difficult to operationalize 
because we do not know exactly what expertise 
is and it is a complex concept that cannot be 
completely accounted for by one measure such 
as number of years of experience. More than one 
measure may be needed. Such an approach has 
been used by Bédard (1986b) who combined 
four variables related to practical experience 
and education into a global measure of expertise 
using a factor analysis approach. 

Identifying the expert is an important prob- 
lem, both for the auditing researcher and the 
practitioner. In order to study expert knowledge 
and problem solving behavior, researchers must 


¹ According to Shanteau (1984), attributes such as self-confidence and ability to communicate that confidence may be central to defining expertise. These attributes might be important in auditing where auditors may have to defend their decisions in court.

first identify the expert(s). The problem is simi-
lar for the practitioners. As indicated by Shpil-
berg & Graham (1986, p. 92), “it is difficult to
identify a knowledge ‘czar’ whose estimates, 
processes, or knowledge are clearly superior to 
what the system and mix of staff, support tools, 
and consulting skills produce.” 

Before reviewing the research studies on ex- 
pertise in auditing, the two main approaches to 
expertise are briefly examined. 


Behavioral view of expertise 

The behavioral view of expertise is based on
Einhorn’s (1974) paradigm. When performing a
task, the decision-maker identifies the informa-
tion cues, measures the level of the cues, or-
ganizes information into clusters and weights
and combines the cues to form a global evalua-
tion. Based on this model of the judgment pro-
cess, Einhorn suggests three necessary, if not suf-
ficient, conditions for expertise:

(a) Experts should tend to cluster variables in 
the same way when identifying and organizing 
cues. 

(b) In measuring the amount of the cue, the 
experts should show high intrajudge reliability, 
high interjudge reliability and be relatively free 
of judgment bias. 

(c) The experts should weight and combine 
the cues in similar ways. 

By presenting these three conditions, 
Einhorn’s goal was to use more objective criteria 
to define an expert than just defining an expert 
as one who is very skillful with training and 
knowledge in some specialized field. This view 
of expertise has led to many research studies 
that have applied these criteria to data obtained 
from subjects with various levels of expertise. 


Cognitive view of expertise 

Cognitive psychologists explain expertise in 
terms of knowledge. This knowledge is obtained 
through direct experience (past judgmental de- 
mand and performance feedback) and indirect 
experience (e.g. education). Knowledge is often 
separated into two categories — public or pri- 
vate. Public knowledge consists of facts, theories 
and definitions from textbooks and journals. Pri-
vate knowledge consists of rules of thumb
(heuristics) that are developed through direct 
experience. 

In medical problem-solving, Feltovich (1981) 
describes the nature of knowledge change as fol- 
lows. First, a physician starts with a small set of 
common and over-simplified rules of evaluation 
learned from textbooks and classroom instruc- 
tion (public knowledge). Through direct ex- 
perience, new disease models are added to 
memory; the number of defining attributes of 
the diseases are increased and diseases are clus- 
tered into sub-categories. This finer categoriza- 
tion aids the development of precision in expec- 
tation and of precise rules for evaluating patient 
data. 

Results from studies in cognitive psychology 
indicate that, compared to non-experts, experts 
have more complete knowledge, better cross-
referencing and memory organization (Chase &
Simon, 1973; McKeithen et al., 1981), and
superior mechanisms for relating problems with
the appropriate knowledge and course of ac-
tions (Feltovich, 1981). Experts also have the
capacity to generate an action in a variety of situ-
ations, including some that have not been ex-
perienced or even anticipated beforehand
(Johnson, 1986).

Johnson (1986) makes the distinction be-
tween expertise and expertness, expertise being
related to the capacity of generating actions
while expertness is the performance of a task
correctly with proficiency and efficiency. An ex-
pert auditor will possess both expertise and ex-
pertness in carrying out a task. However, a new
auditor may display expertness in carrying out
an auditing task while not possessing the exper-
tise of those who designed the sequence of activ-
ity he has mastered. The new auditor will display
expertness because in the CPA firms, knowledge 
is distributed within the firm through the use of 
specialized questionnaires, decision aids and 
training sessions. 


Distinction between the behavioral and cogni-
tive views

The behavioral view of expertise focuses on 
the output of the decision process and does not 
pay attention to the differences between the
cognitive processes of expert and novice
auditors and the way these processes might in-
fluence their judgments. Studies using this
approach focus on the effect of expertise on
parameters of judgment such as consensus, sta-
bility, self-insight and cue importance, without
trying to understand why and how experts may 
show more consensus or make decisions that 
differ from those made by novices. 

In contrast, the cognitive view of expertise 
focuses on cognitive processes and the knowl-
edge base underlying the behavior of experts
and novices. The objective is to understand how 
experts make decisions. Studies using this 
approach focus on the effect of expertise on 
memory content and organization, decision pro- 
cesses and their interrelation with decisions. 


Summary 

Expertise is an elusive concept. It is not well 
understood and consequently, it is difficult to
define and operationalize. Depending on the ap-
proach used, expertise may be defined in terms
of knowledge, decision process, and quality of
judgment as measured by consensus. Because of 
these ambiguities in the concept of expertise, it 
is difficult to identify who is an expert or who is
more expert.


BEHAVIORAL STUDIES OF EXPERTISE IN 
AUDITING 


The nature of expertise has been addressed in 
several recent auditing research studies. In some
of these studies, expertise was the central aspect 
of the study (Waller & Felix, 1984b; Weber, 
1980; Hamilton & Wright, 1982; Messier, 1983; 
Nanni, 1984; Bédard, 1986a). This section pre- 
sents a review of selected behavioral studies of 
expertise in auditing. 

Behavioral studies on expertise in auditing
have traditionally focused on the effect of exper-
tise, as measured by years of experience, on
parameters of judgment (consensus, stability,
self-insight, importance of cues) and on the judg-
ment itself. Most of the behavioral studies are
based on the Brunswik lens model. They in-
clude Ashton (1974a,b), Joyce (1976), Hamil-
ton & Wright (1977), Reckers & Taylor (1979),
Ashton & Kramer (1980), Ashton & Brown
(1980), Hamilton & Wright (1982), Gaumnitz et
al. (1982), Tabor (1983), Messier (1983),
Nanni (1984) and Boritz et al. (1987).

The typical lens model study uses a set of cases 
containing five to eight information cues. The 
cases are generated according to a half factorial 
design. The variables that are examined in these 
studies are the level of consensus among sub-
jects in their decisions, the stability of judgment
over time, the insight that individuals possess
into their own judgment process, the proportion
of variance explained by the different cues and
the effect of experience on these variables. Table
1 gives a summary of the results related to exper-
tise for these studies.
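
As a rough illustration of how such a set of cases might be generated, the sketch below builds a half fraction of a two-level factorial design for five cues. The ±1 coding, the choice of five cues, and the rule that fixes the last cue as the product of the others are assumptions made for the example; the reviewed studies do not report their exact design generators here.

```python
from itertools import product


def half_fraction(n_cues):
    """Return a half fraction of a two-level factorial design.

    Each case is a tuple of cue levels coded -1 (low) or +1 (high).
    The last cue is set to the product of the other cues, which is one
    standard way to halve the design; this generator is an assumption
    for illustration, not the one used in any particular study.
    """
    cases = []
    for base in product((-1, 1), repeat=n_cues - 1):
        last = 1
        for level in base:
            last *= level
        cases.append(base + (last,))
    return cases


cases = half_fraction(5)
print(len(cases))  # 16 cases, versus 32 for the full factorial
```

With this construction every cue appears at its high and low level equally often, and each pair of cues is uncorrelated across the cases, which is what allows the cue weights to be estimated separately.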

Other studies not based on a lens model in- 
clude Weber (1978), Mock & Turner (1981) 
and Bédard (1986a). In these studies, the 
auditors worked with a fairly complete set of 
working papers and were asked to make audit 
planning decisions. For the variables listed in 
Table 1, only the effect of expertise on the judg- 
ments was examined by Weber and by Mock & 
Turner. Weber found a significant effect while
Mock & Turner found no significant effect. Bédard
looked at the effect of expertise on consensus
and judgments. He found that there was more
variability in the judgments of the experts than
the novices. There were also significant differ-
ences in the judgments of the two groups (ex-
perts and novices).

The main results for the effect of expertise on 
parameters of judgment and on the judgments 
are now summarized and discussed. 

(1) Consensus. This measure has been exten- 
sively used as a measure of the quality of audit 
decision-making. It is based on Einhorn’s (1974) 
proposition that experts should show high inter- 
judge reliability and the fact that “the auditing 
profession engages in several efforts to promote 
consensus” (Ashton, 1983, p. 10). 

The proponents of consensus argue that lack 
of consensus affects audit efficiency and effec- 
tiveness and that it may also affect accuracy 
(Ashton, 1983; Mock & Turner, 1981; Joyce,
1976). Lack of consensus might then be an indi- 
cation of a problem, but it does not necessarily 
indicate where the problem is, i.e. who made the 
incorrect decision. Also, when high consensus is 
found, it is possible that subjects have all agreed 
on the wrong decision (Biggs, 1982). 
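
One way to make the idea of consensus concrete is the mean pairwise correlation between the judgments of different auditors over the same cases. The sketch below assumes that operationalization; it is a common choice for inter-judge agreement, but the text does not state which measure each study used, and the ratings are made-up numbers.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of judgments."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def consensus(judgments):
    """Mean pairwise correlation across all pairs of judges (assumed measure)."""
    pairs = list(combinations(judgments, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings of six cases by three auditors.
ratings = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],
    [6, 5, 4, 3, 2, 1],
]
print(consensus(ratings))  # about -0.33: two auditors agree, the third reverses them
```

The measure only captures agreement: a group that uniformly ranks the cases in the wrong order would still score 1.0, which is the limitation noted above.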

Of the ten studies that have examined the rela-
tionship between consensus and expertise, a
negative association was found in three of them,
a positive association in three of them and no
association in the remaining four. In
most cases, the association between consensus 
and expertise was small. For example, the corre- 
lation found by Hamilton & Wright (1982) be- 
tween experience and consensus is —0.20. 

The results from the studies are distributed al- 
most equally between the three possibilities (no 
association, positive association and negative as- 
sociation), indicating that the three possibilities 
are equally probable (Pearson’s chi-square = 
0.20, p = 0.97). Thus, it cannot be concluded
that consensus increases with expertise level. 
This result, plus the fact that when a relationship 
was found it was weak, seems to indicate that 
there is no meaningful relationship between 
consensus and expertise. 
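
The chi-square value quoted above can be checked with a short goodness-of-fit calculation. The sketch assumes that the ten studies split into four with no association, three positive and three negative (the count of four is inferred from the totals given in the text) and that each outcome is equally likely under the null hypothesis.

```python
# Observed number of studies per outcome; the "no association" count is
# inferred as 10 - 3 - 3 and is an assumption of this sketch.
observed = {"no association": 4, "positive": 3, "negative": 3}

total = sum(observed.values())
expected = total / len(observed)  # equal probability for each outcome

# Pearson chi-square goodness-of-fit statistic.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
print(round(chi_square, 2))  # 0.2
```

The statistic matches the reported 0.20; the sketch does not attempt to reproduce the reported p value.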

(2) Stability. This variable refers to the stabil- 
ity of judgment over time or over repeated trials. 
Stability is hypothesized to be an indicator of the 
quality of audit decision making because un- 
stable judgments have a detrimental effect on ac- 
curacy (Ashton, 1983). Stability has the same 
drawback as consensus: a low level of stability
may indicate a problem, but does not indicate
which decisions are incorrect. Also, a high level
of stability does not imply a correct decision.
All five studies which have examined the rela-
tionship between expertise and stability found
no association between these two variables.
There is then strong evidence that stability of
judgment does not vary with experience.

(3) Self-insight. This measure refers to the 
auditor’s understanding of his own decision pro- 
cess as represented by a statistical model of that 
process. It is considered important to have a 
high degree of self-insight because auditors 
often have to explain and discuss their judgment 
with others. 
Of the seven studies which have examined the
relationship between expertise and self-insight, 
all except Hamilton & Wright (1977) found no 
relationship between these two variables. Based 
on this evidence, it seems that expert auditors 
do not have a better understanding of their own 
decision process than non-experts. Results in 
cognitive psychology also indicate that, for 
routine judgments, expert auditors might actu- 
ally have less self-insight because their judg- 
ments may not be conscious (Anderson, 1982; 
Gibbins, 1984). 

(4) Cue importance. In the behavioral
approach, cue importance is indicated by the 
parameters from the statistical model of the 
auditor’s decision process. The relationship be- 
tween experience and cue importance was 
examined in five studies of which only two 
found a small relationship between experience 
and cue importance. These results suggest that
experience is not related to cue importance or that, if it
is, the relationship is not strong.

(5) Judgments. Eight of the studies also 
examined the effect of expertise on the actual 
decisions of the auditors. All except Mock & 
Turner (1981) found a significant difference in 
the judgments. It is, however, difficult to draw 
any conclusions on the direction of the differ- 
ences since, in some studies, less experienced 
auditors proposed more audit work than expert 
auditors and in others, they proposed less audit
work. For example, Weber (1978) found that 
the estimated number of hours required to com- 
plete the inventory audit was positively related 
to degree of experience while Reckers & Taylor 
(1979) found that less-experienced auditors 
tended to “over-audit”. 

These contrary results might be caused by the 
differences in the experimental task (internal 
control evaluation, audit planning, materiality) 
or by other factors not controlled for in these 
studies. For example, Tabor (1983) found that 
less experienced auditors selected a smaller 
sample size than experienced auditors. This re-
sult might be caused by a higher reliance on in-
ternal control, a higher materiality threshold,
cost/effectiveness considerations, or it
might result from differences in the perception
of the relationship between internal control and
the extent of audit testing. It is then very difficult
to arrive at any conclusion on the direction of
the relationship or on the correctness of each
group’s decision. However, it seems that having
more expertise might affect the auditor’s deci- 
sion. 


Heuristics and biases

Another group of studies that can be classified 
under behavioral studies is the heuristics and
biases literature or, as von Winterfeldt & Ed-
wards (1986) call it, the cognitive illusions liter-
ature. Three judgment heuristics identified by
Tversky & Kahneman (1974) have been studied 
by auditing researchers: judgment by represen- 
tativeness, judgment by availability and judg- 
ment by adjustment. Although they are efficient, 
these heuristics can lead to systematic biases
such as ignoring base rates, ignoring sample
sizes, and insensitivity to the predictability of
data.²

To the author’s knowledge, no study has tried 
to look at the effect of expertise on these biases. 
These biases, however, seem to be present 
across a wide range of subjects from students to 
statistically savvy psychologists (Slovic et al., 
1977). In auditing, research studies indicate that 
auditors have difficulty in understanding the im-
plications of sample information and can also
exhibit these biases. But there is not enough evi-
dence to make specific comparisons between
auditors’ biases and those found for students in 
the psychological literature. 


Discussion

From this review of the results of behavioral 
studies, it appears that expertise does not have a 
systematic effect on the parameters of judgment 
and on the judgment itself. Only the judgment 
seems to be affected by expertise, but the effect 
is not systematic. Some possible explanations for 
these results are: (1) there are methodological 
weaknesses in the studies, (2) the theories 
suggesting that expertise may affect these vari-
ables are not valid, or (3) expertise does not
exist in auditing. 


Methodological weaknesses in the studies. 
One possible explanation for the lack of signifi- 
cant results is that the range of experience levels 
of the subjects was too narrow. Hamilton & 
Wright (1982), however, used a broader range 
of experience level and a larger percentage of 
experienced auditors and found no significant 
difference either. Another possible explanation 
is that, of all the auditors’ years of experience, 
only the first years of experience involve making 
the audit planning decisions examined in most 
of these studies (e.g. evaluating internal control 
systems). Some studies have, however, 
examined the effect of specific experience 
(Nanni, 1984; Bédard, 1986a) and have obtained 
similar results. Finally, the high level of structure 
in the decision task may explain the general lack 
of effect. Expertise may have its greatest impact 
in complex, ill-structured problems. However, 
with more than 13 studies (with their own weak- 
nesses) all pointing in the same direction: no sys- 
tematic differences, it appears that methodolog- 
ical weaknesses might not be a valid reason for 
the lack of significant results. 


Underlying theory. The theories suggesting 
that expertise may affect these variables may not 
be valid. For example, Simon (1979, p. 42) states 
that 


We must expect to find different systems using quite dif- 
ferent strategies to perform the same task. I am not aware 
that any theorems have been proved about the uniqueness
of good, or even best, strategies. Thus, we must expect to 
find strategy differences not only between systems at dif- 
ferent skill levels, but even between experts. 


Expert auditors may then use different proces-
ses to make their judgments, weight the informa-
tion cues differently, and may not necessarily ar-
rive at more consensus than novices.³ Experts

² For a detailed review of these heuristics and of the auditing studies, see Libby (1981) and Ashton (1983).

³ If consensus is examined from an expected utility framework, differences in consensus between expert and novice auditors might be caused by differences in their probability estimates and/or by differences in their utilities (Lewis, 1980). Within that framework, studying the effect of expertise on the two components of the decision model should help understand the effect of expertise on consensus.


may have less consensus because of the differ- 
ences in their direct experiences. On the other 
hand, novices may have more consensus be- 
cause they have common knowledge from text- 
books and classroom education. 

Also, results from cognitive psychology indi- 
cate that some experts’ judgments may not be 
conscious since they are routine. For these 
routine judgments, it might be expected that ex- 
perts would have less self-insight than non- 
experts. 

The evidence accumulated from behavioral 
studies indicates that expert auditors do not be- 
have differently from novice auditors. This con- 
clusion seems counter-intuitive and will prob- 
ably be very difficult to defend before an auditor 
who has studied for years, is still taking training 
courses, and is working very hard to get more 
experience and become a partner. This conclu- 
sion is, however, dependent on the way exper- 
tise is defined in the studies and on the attributes 
of behavior that were examined. Cognitive 
studies, which look at different attributes of ex- 
pertise, indicate that there are differences be- 
tween expert and novice auditors. These studies 
are now reviewed in the next section. 


COGNITIVE STUDIES OF EXPERTISE IN
AUDITING


The use of the cognitive approach has blos-
somed recently in auditing. In fact, all of the
studies reviewed were conducted in the
eighties. Cognitive studies on expertise in audit-
ing have focused on the expert’s knowledge and
its role in professional judgment. Both theoreti-
cal and empirical studies have been performed
by auditing researchers. Various methods such
as cue recall, process tracing (verbal protocol,
information retrieval techniques) and expert
systems have been used to study expert judg-
ment.


Theoretical studies

Based on results in other fields, some auditing 
researchers have conjectured about the expert’s 
knowledge and its role in audit decision-making
(see Libby, 1981; Gibbins, 1984; Waller & Felix,
1984a,b). 

Gibbins (1984) provides 21 propositions 
about professional judgment in accounting. Of 
these, three are related to the judge’s experi-
ence. According to Gibbins, experience creates
a knowledge structure, consisting of a system of 
schematized and abstracted knowledge, which 
is maintained in long term memory. This knowl- 
edge structure provides a guide to the judgment 
process and the response for situations that arise 
in the audit. The attributes of the knowledge 
structure, such as availability, level of detail,
cuing and organization, are shaped by past 
judgmental demands and performance feed- 
back. 

Waller & Felix (1984b) focus on the effect of 
past judgmental demands and performance feed- 
back on the attributes of the knowledge struc- 
ture such as availability, level of detail, cuing and 
organization. Relating the results from cognitive 
psychology and learning literature to the case of 
the auditor, they suggest that auditors’ learning
from experience may be deficient because 
people manifest a strong bias toward seeking evi- 
dence that will confirm rather than disconfirm a 
current belief or hypothesis. People also do very 
poorly at judging co-variation. In many audit 
situations, outcome feedback is only partial 
which may complicate the auditor’s judgment 
about co-variation. 

From the results on knowledge structure and 
content in cognitive psychology, Waller & Felix 
(1984a) propose a general schematic memory 
model of the audit opinion formulation process. 
Like Gibbins (1984), they suggest that the 
auditor arrives at work with a memory structure, 
but they go more into detail and speculate as to 
what knowledge is present and how it is or- 
ganized. They propose that the auditor has a 
number of macro-template schemata with 
which are associated a number of narrower 
template schemata and procedural schemata. 
Each of the schemata has nodes and relations. 
Associated with each node are sets of conditions 
and possible values. According to this model, in 
an audit, the auditor will recognize a pattern of 
cues that he classifies into categories. He then 
uses his procedural schemata to test the fit of a
template schema to the current audit situation
as indicated by the pattern of cues. 

In summary, it seems that, in theory at least, 
auditors acquire knowledge through experience 
and develop a knowledge structure. Expert 
auditors will then possess more knowledge and 
a “better” memory structure than novices. Their 
knowledge may, however, be biased because 
their learning from experience may be deficient. 

A major problem with these theoretical 
studies is that they try to transfer results from
other fields to auditing. This exercise has the ad-
vantage of introducing a new approach to audit-
ing researchers and suggesting useful avenues
for future research. However, most conjectures 
are not based on evidence in auditing. Also, 
although most cognitive psychologists will 
agree that experience produces a memory struc- 
ture that guides judgment, there will probably 
be some disagreement as to the form of the
memory structure. The researcher must then be
careful not to take all these conjectures for
granted. As suggested by Libby (1984), those 
who want to study experts’ memory structures 
should conduct experimental tests of their 
theories or suggest testable hypotheses such as 
Gibbins (1984) did. Some auditing researchers 
have already conducted experimental tests of 
cognitive theories in auditing. These studies are 
reviewed in the next section, along with descrip- 
tive studies which used a cognitive approach. 


Empirical studies 

Empirical studies using the cognitive 
approach to study expertise in auditing have 
focused on two aspects of expertise: memory 
organization and decision process. Although 
these two topics are presented separately, they 
are interrelated. Memory organization has an in- 
fluence on the decision process. Also, memory 
structure can be inferred from the auditor’s deci- 
sion process. A summary of the results related to 
memory organization and decision process is 
presented in Table 2. 


Memory organization. As indicated in the 
previous sections, through experience, expert 
auditors may have developed more complete
knowledge, better cross referencing and better 
memory organization. One way to organize the 
memory is to group the information by category 
(cue clustering). Cue clustering is an efficient
way to organize the information because it re-
duces the information load placed on the deci-
sion maker.

The memory organization of EDP auditors was 
examined by Weber (1980). Using a cue-recall 
method, he studied how computer controls 
were organized in the auditors’ memory and
looked at the differences in memory organiza- 
tion between experts and non-experts. For the 
measure of category clustering, he arrived at a 
value of 0.3095 for EDP auditors as compared to 
0.0941 for non-experts (students in this case). 
These results indicate that experts organize 
their memory more through the clustering of 
the computer controls by category than do non- 
experts. Within the EDP auditor group, how- 
ever, the level of experience had no effect on the 
degree of clustering. 

Libby (1985) also studied memory organiza-
tion. He examined the memory organization of
financial statement errors for experienced
auditors. He expected that financial statement 
errors would be organized by transaction cycle. 
Based on the premise that it is easier to access in- 
formation in the same category than to move to 
another category, he postulated a hypothesis on
the way auditors generate hypotheses in prelimi- 
nary analytical review. His results for the direct 
test of this hypothesis supported the proposed 
organization. His results also suggest that 
auditors may further categorize financial errors 
by error type (e.g. cutoff). However, the study 
does not provide any insight on the effect of ex-
pertise on the organization of financial state-
ment errors.

The effect of expertise on memory organiza- 
tion was examined by Frederick & Libby (1986). 
Their results indicate that, when asked for the 
consequences of control weaknesses, experi- 
enced auditors form their judgment based on 
the relations between accounts and on the per- 
ceived causal correlation between weaknesses 
and accounting errors. Novice students, having 
less knowledge, form their judgment based only
on the relations between accounts. 
In summary, it appears that expert auditors
may have more knowledge and that they organize
their knowledge by category. Weber’s
study provides some evidence that expert
auditors tend to cluster the information more by 
category than non-expert auditors. Libby’s 
study, however, does not provide any evidence 
of the effect of expertise on category clustering. 
For future studies the use of a control group con- 
sisting of non-expert auditors (e.g. Frederick & 
Libby, 1986) should be considered. This will
facilitate testing to determine whether the 
categorization or relations inferred from the experiment
are idiosyncratic (Weber, 1980). Also,
since one of the goals of cognitive research is to
transfer expert knowledge to novices or to develop
decision aids that will help novices perform
like experts, understanding the differences
in knowledge organization between experts and
novices would be useful.


Decision process. These research studies provide
some evidence of the role of knowledge on
the decision process and may also provide evidence
on memory organization. Auditors’ decision
processes were studied using (1) process
tracing methods such as verbal protocol analysis
(Biggs & Mock, 1983; Biggs et al., 1985; Biggs et
al., 1986) and information retrieval techniques
(Bédard, 1986a), (2) expert system methods
(Gal & Steinbart, 1985; Meservy et al., 1986),
and (3) others (Libby, 1985; Frederick & Libby,
1986). Since all these research methods have the 
same objective, to examine decision processes, 
the results are grouped by decision process attri- 
butes. These attributes are: information search, 
hypothesis generation, decision strategy and 
action/choice. 


First, as far as information search is concerned,
process tracing and expert-system
methods provide a trace of the information
sought by the subjects, the quantity of information
acquired and the duration of the task.
Results from the studies examined indicate that,
in general, auditors perform a sequential and
quite comprehensive search of the available
information.


The three studies (Biggs et al., 1985, 1986;
Bédard, 1986a) that examined the expert-novice
dimension found that the two groups of
auditors use a sequential strategy. In contrast,
Bouwman (1982), using a financial analysis task,
found that novices use an undirected sequential
strategy where the information is acquired in the
order of presentation, and that experts seem to
base their search on a standard list of questions
or schemata that guide their search. 

The contradictory results between these two 
fields (auditing and financial analysis) might be 
attributed to the nature of the task. In auditing,
firms have imposed structure on the audit process
by developing methods, questionnaires and
decision aids (see Cushing & Loebbecke, 1986).
Because of these aids, the structure is known
both by novices and experts. Also, for most of
the case material used in the auditing studies, the
information was presented in an order consistent
with firm structure.

Biggs et al. (1986) found that, compared to
experts, novices spend more of their search
activities on the instructions and procedures. A
similar result was obtained by Bédard (1986a)
who, using a sample of 52 auditors, found that
novices spend 40% more time on instructions
than experts. Biggs et al. interpret this result as
an indication that experts have internal
schemata that allow them to identify the type of
problem involved and to use these schemata to
solve the problem. Novices, on the other hand,
have to devote more of their search activities to
the instructions because of their less developed
problem schemata.

The second decision process attribute, the
generation of hypotheses and their evaluation, 
represents a commonly-used problem-solving 
process. According to Elstein et al. (1978), this 
process is an essential characteristic of judgment 
in complex and poorly defined situations. Since 
the audit environment is complex and poorly 
defined in many respects, it might be expected 
that hypothesis generation plays an important 
role in auditor judgment. Evidence of hypothesis 
generation has been reported in four of the cognitive
studies.

Two studies indicate that hypothesis generation
is associated with past experience. Biggs et
al. (1985) report that their subjects relied 
heavily on comparison between the current case 
and previous client situations that they have experienced.
Libby (1985) found that perceived
error frequency and recency of experience were 
associated with the errors generated as possible 
causes of variation in financial ratios. These 
results are consistent with Gibbins’ proposition 
that experience creates a memory structure 
which provides a guide to the judgment process 
and then influences the hypotheses generated. 

Biggs & Mock (1983) found that subjects generated
alternative hypotheses relatively early in
the task, but that they waited until they had re- 
ceived supporting information. Except for one 
subject’s decision, the auditors did not exercise 
early closure on one hypothesis. They consi- 
dered alternatives. It is important that auditors 
consider alternative hypotheses because, in 
their study on medical decision-making, Elstein 
et al. (1978) found that subjects who generated
the correct hypothesis during the decision pro- 
cess made the correct final diagnosis. 

An interesting result of hypothesis generation 
is that there might be inconsistencies in the 
hypotheses generated by auditors. Meservy et al. 
(1986) found that their computer model of an 
auditor was more consistent in the hypotheses
generated and that it never forgot to analyze any
possibility. In contrast, depending on the audit 
case (there were three cases), some possibilities 
were not analyzed by the subjects. 

Because hypothesis generation is associated
with past experience, it might be expected that
level of expertise will affect hypothesis generation.
But this is a conjecture, since none of the
studies have examined the novice-expert
dimension of the hypothesis generation problem.

The third attribute, decision strategy, refers to 
the sequence of operations that auditors use for 
problem-solving. Two main decision strategies 
have been identified by Biggs & Mock (1983): a
systemic and a directed strategy. In their study, 
the systemic strategists performed a thorough 
and sequential search of information prior to 
making the four choices requested in the experiment.
The directed strategists selected an audit
step and then directed their search for and 
evaluation of information relevant to that audit 
step alone. The strategies seem to be related to 
experience with the more experienced subjects 
using a more systemic strategy and the less ex- 
perienced subjects a more directed strategy. 
Biggs et al. (1985) also found that their subjects 
used these two strategies. As in Biggs & Mock
(1983), the least experienced subjects used a
directed strategy. It is, however, difficult to
draw any conclusion on the relationship between
experience and search strategy because
the samples are very small (four for Biggs & 
Mock and three for Biggs et al.). 

Both Meservy et al. (1986) and Bédard 
(1986a) found that auditors used a systematic 
strategy. Some of their subjects reverified a few 
pieces of information before making decisions, 
but the global strategy was mostly systemic. It is 
difficult to explain the difference in the results 
because, except for Biggs et al., the subjects
from these four studies come from the same CPA
firm, the decision tasks were similar (internal
control evaluation) and, in two cases, the research
method used was protocol analysis.

It seems, however, that with a total of 56 
auditors, Bédard’s (1986a) and Meservy et al.'s 
(1986) results might be more representative of 
the decision strategy of the auditors. Moreover, 
these results were substantiated by Biggs et al. 
(1986) who found that a more systemic strategy 
was used by their subjects. As with the sequential
acquisition of information, the systemic strategy
might depend on the structure imposed by the
firm on the audit process. Both expert and non-expert
auditors might have learned the firm’s
structure and used it as a guide to their decision 
process. 

Finally, various aspects of the quality of the
actions and choices of the auditors were
examined as the fourth attribute in the studies
reviewed. As with the behavioral studies, some
cognitive studies have reported results on consensus
level. Biggs & Mock (1983) found that
the consensus between auditors was low, while
Biggs et al. (1983) found a high level of agreement
between auditors. As indicated previously,
Bédard (1986a) found that experts had a lower
level of consensus than non-experts. As with the
behavioral studies, consensus results are contradictory.
Since the issue of consensus was
examined in depth previously, it is not discussed 
further here. 

On the expert—novice dimension, Biggs et al. 
(1986) found that experts were more selective 
than novices in modifying the planned substan- 
tive audit program based on results from analyti- 
cal review procedures. These experts appear to 
evaluate the effect of results from analytical 
review by financial statement assertion, modifying
only those substantive tests verifying the affected
assertions. In contrast, the novices modified
the audit work throughout the entire cycle.
The experts’ audit strategy might then be more
cost-effective than the novices’ strategy.


Discussion 

Although relatively new in auditing, the use of 
the cognitive approach to study expertise has al- 
ready provided us with a better understanding of 
what expertise in auditing is. As compared to 
novices, expert auditors have been found to 
have better knowledge (Frederick & Libby, 
1986), to organize their knowledge more by 
category (Weber, 1980), and to have a finer 
categorization (Biggs et al., 1986). Both experts 
and novices seem to perform a sequential and 
quite complete information search. The results 
concerning decision strategies are contradictory.
Some studies have found that novices use a
directed strategy while others found no differ- 
ence between the search strategies of experts 
and novices. Finally, there is some indication 
that experts might make “better” audit decisions 
than novices. But the issue of decision quality 
needs more research because this conclusion is 
based on only one study and results from medi- 
cal decision-making indicate that experts might 
not make better decisions (see, for example, 
Johnson et al., 1981). 

The cognitive approach has provided auditing 
researchers with stronger theoretical bases for 
researching expertise in auditing. Up to now, 
however, most of the studies using the cognitive 
approach are descriptive and, consequently, it is
difficult to generalize their results. These studies 
along with cognitive theories suggest interesting 
propositions that may be tested using the 
hypothetico-deductive approach. This issue will 
be addressed in the next section of this paper. 


FUTURE RESEARCH 


Based on this review, it appears that our 
knowledge of what expertise is has increased in
the last ten years. This evolution seems to parallel
the move from a behavioral approach to a
cognitive approach. At the beginning, expertise 
was considered as a construct that may 
adequately be approximated by number of years 
of experience, and the focus of expertise re- 
search was on its effect on the input and output 
of the judgment process. With the cognitive ap- 
proach, the focus is more on the inside of the 
“black box”, the understanding of the expert’s 
memory and decision process. 

Our knowledge about expertise in auditing is, 
however, still embryonic: it is not known what
expertise is. The need for research on expertise 
is acknowledged by the practitioner. For ex- 
ample, Shpilberg & Graham (1986) identify the 
need to better understand what is meant by “ex- 
pertise” as an important issue for future re- 
search. In this section, some suggestions as to 
what should be done in future research are pro- 
vided. 

A general weakness of the studies reviewed is 
the lack of a theoretical basis regarding the effect
of expertise on decision behavior. An explicit 
formulation of theory has the advantage of guid- 
ing the researchers and indicating what to look 
for in their analysis of the data. If the theory is 
well developed, it will be possible to specify 
hypotheses and conditions under which the
hypotheses should be valid. This researcher believes
that knowledge about expertise will grow
faster if there is a better theoretical formulation 
of the problem before the data collection and 
analysis phases. This does not mean that no more 
descriptive research should be performed, but 
there is room for more empirical studies with 
good theoretical bases (e.g. Weber, 1980; Libby,
1985; Frederick & Libby, 1986). 

Theories on expertise and its effect on deci- 
sion processes have been developed in other 
fields such as cognitive psychology and medical 
decision-making. Auditing researchers have also 
developed theoretical propositions related to 
expertise (e.g. Gibbins, 1984; Waller & Felix, 
1984b). Those who wish to investigate expertise
in auditing may use these theories or
propositions.

There should also be more ties between the 
studies. For the studies using the lens model, the 
comparison between studies is easy because a 
number of standard attributes are examined. For 
the studies using the cognitive approach, it is difficult
to make a comparison and draw general
conclusions. The coding of the protocols is
often different, as are the terms used to describe
the behavior or knowledge. Also, many studies
are conducted in isolation from others. There is
a need for a standard way to code protocols so
that it will be easier to compare study results 
(see Klersey & Mock, 1987). Also, there is room 
for a program of research on expertise such as 
the one performed by Elstein et al. (1978). In 
their five year program, they used a multi- 
method approach and a variety of experimental 
material to study the decision behavior of expert 
and non-expert physicians. 

A control group is necessary in order to deter- 
mine whether the behavior of the expert results 
from the research material, results from the expertise
of the auditor, or is idiosyncratic. Without
a control group, it is impossible to determine
whether the observed behavior of the experts is 
common to both novices and experts or only to 
experts (Fiedler, 1982). Thus, future research 
on expertise in auditing should use a control 
group. The discussion on the definition of exper- 
tise presented at the beginning of the paper 
might be useful in identifying the experts. Some
suggestions as to worthwhile future areas of research
on expertise in auditing are now
presented.

Since expertise is often referred to in terms of 
knowledge, future research on expertise should 
examine expert auditor knowledge and the way 
this knowledge changes as auditors acquire experience
in the field. A study by Gal & Steinbart
(1986) using the expert system method provides
some preliminary results on knowledge
change, but more research is needed to identify
how audit knowledge and its organization
change as auditors acquire expertise. An interesting
study which could be replicated in auditing
is Feltovich (1981) on the knowledge-based
components of expertise in medical diagnosis.

Further research into the way audit knowl- 
edge influences decision processes and perform- 
ance may help understand expertise in auditing. 
Libby (1985) and Frederick & Libby (1986) pro- 
vide good examples of studies that can be per- 
formed in this area. An interesting approach to 
the study of this problem would also be to per- 
form a free recall experiment (Weber, 1980) to 
determine the memory structures of experts and 
then examine their effect on problem-solving 
behavior using a process tracing method. In 
some audit situations such as fraud detection 
(see Pincus, 1985), it is possible to determine 
what is a good decision and a bad decision. Identifying
knowledge and decision behavior that
lead to good and bad decisions in these situations
might help indicate components of judgment
that need help.

An important research issue seems to be the 
determination of who is an expert or what is 
expertise. Is an expert an auditor who has more 
domain-specific knowledge or is he an auditor 
who has very broad experience and auditing 
knowledge and who can deal with a very large 
range of problems like a management consul- 
tant, or are they both experts? This is an import- 
ant issue that is linked to the definition of exper- 
tise. Expertise is often defined in terms of knowl- 
edge. A person who has specialized knowledge 
in an area is an expert. For example, an auditor 
specialized in statistical sampling will be consi- 
dered as an expert; but, an expert might also be 
an auditor without specialization. When looking 
at a problem, this auditor might add a broader 
view (expertise) to the problem and help reach 
a better decision. It would be interesting to compare
generalists and specialists in terms of
knowledge, decision process and decisions.

In conclusion it appears that, at this moment,
an answer to the question in the title is: it depends!
It depends on how expertise is defined.
Abstract


Recognizing the increasing interest in verbal protocol analysis in auditing research, this paper reviews the 
literature in this area. Consideration is given to the research problems investigated, the underlying theories 
appealed to, the methodological issues addressed and the overall contribution made by each of seven 
studies. Future directions for audit research using protocol techniques are then discussed prior to a
concluding assessment of research in the area.


Investigating the mental processes of decision
makers is a difficult and challenging task. Traditionally,
“black box” strategies, such as the Lens
Model, have been employed to examine the 
stimulus—response patterns of decision makers. 
These models, however, have circumvented the 
question of what actually goes on within the 
individual. A major purpose of process tracing 
paradigms, in particular verbal protocol analysis, 
is to examine the connection between a deci- 
sion maker’s decision processes, the use of 
memory and the actual choice, decision or 
action taken. 

This paper reviews the use of verbal protocol 
analysis in auditing research. Initially, a brief dis- 
cussion of the importance of process tracing in 
audit judgment research will be presented. Fol- 
lowing this, research in auditing which has 
employed the verbal protocol paradigm will be 
examined. Research problems, theories, encod- 
ing, design and contributions for seven protocol 
studies will be evaluated. Next, future directions 
and possibilities for audit judgment research 
using protocol techniques will be suggested.
Finally, a summary will be provided and conclusions
drawn regarding the important research
findings. 


THE IMPORTANCE OF PROCESS TRACING 
RESEARCH IN AUDIT JUDGMENT 


The current climate in Congress and among 
regulative bodies such as the SEC challenges the 
auditing profession to improve its functioning — 
to increase confidence in the audit process and 
to reduce the risk of audit failures. As Libby &
Fishburn (1977) observed, understanding how
individuals make decisions has direct relevance
for improving decisions. However, how does 
one research an individual’s decision-making 
processes? Verbal protocol analysis is a 
methodology which can be used to provide this 
knowledge. Larcker & Lessig (1983, p. 74) indi- 
cate that, “if the research goal is understanding a 
subject’s cognitive processes, a process tracing 
procedure seems to be required”. 

The task of auditing is complex — many items 
of information must be tested for validity and
accuracy. For some audit judgments, the possi- 
bility for error may be controlled through statis- 
tical sampling and use of specific statements on 
auditing standards. What is not fully controlled 
by these measures, however, is the propensity 
for the individual to make mistakes. This natural 
inclination to err is one reason why the study of 
judgment decision-making tasks, and in particu- 
lar auditor judgment, is important. “Errors in 
judgment suggest ways in which performance 
might be improved, especially if one under- 
stands why the errors occurred” (Pitz & Sachs, 
1984, pp. 140-141). Payne (1976), for example, 
found a decrease in the proportion of informa- 
tion used as the number of attributes increased. 
In the audit context, Payne’s finding may have 
implications for both error and fraud detection. 
It is possible to use verbal protocols to trace the 
sequence of operations involved in information 
acquisition and to make inferences about the 
way the information is used. This understanding 
could help to preclude some errors by training 
auditors to attend to information which is signifi- 
cant, but which may be overlooked. 

Research has shown that not only do individuals
attend to different pieces of information
but they also differ widely in the types of choice 
criteria and strategies used in the judgment deci- 
sion-making task. Lussier & Olshavsky (1974) 
attribute these differences in choice or strategy 
to different perceptions or internal representa- 
tions of the external environment while Payne 
(1976) attributes the differences to variations in 
memory storage. Understanding auditors’ 
choice criteria and strategies can provide data 
about how audit judgment is actually made and 
perhaps how it may be improved. Verbal pro- 
tocol analysis is one methodology which can 
give researchers this type of information (Biggs 
& Mock, 1983; Meservy et al., 1986). 

Understanding auditor judgment also has 
implications for sharing and extending that 
knowledge. Specifically, it may be possible to 
teach audit judgment skills and thus share the
knowledge with “new” auditors. In another context,
it has been shown that in some audit tasks
relevant auditor knowledge can be captured in
an expert system (e.g. Meservy et al., 1986; Dillard
& Mutchler, 1986). The expert systems
which have been developed to date are built in 
part by acquiring the knowledge-base using ver- 
bal protocols. 

There are limits, however, to the interpreta- 
tion and use of verbal reports. “It is likely that 
verbal protocols can provide useful information 
about deliberately selected strategies, but not 
about the more automatic, intuitive processes” 
(Pitz & Sachs, 1984, p. 146). Also, “process 
tracing studies have been successful in describ- 
ing strategies, but generally not in predicting 
their use” (Klein, 1983). An understanding of 
the methodological considerations surrounding 
the use of process tracing techniques and the im- 
plications of these methodological constraints 
for audit researchers is therefore significant. 
Nisbett & Wilson’s (1977) article on the weakness
in verbal reports and Ericsson & Simon’s
(1984) rebuttal to their arguments provide fer- 
tile ground for investigation of these questions 
in the auditing environment (Anderson, 1985; 
Boritz, 1986; Boritz et al., forthcoming). 

Finally, building a general theory of judgment 
would be of benefit since it would provide a 
benchmark for comparisons. However, “if a gen- 
eral theory is to be found, it would be helpful to 
establish a compendium of basic cognitive 
mechanisms involved in [judgment decision 
making] tasks” (Pitz & Sachs, 1984, p. 147). 
Again, verbal protocol analysis is one method 
which can provide this information. 


EVALUATION OF VERBAL PROTOCOL 
STUDIES IN AUDITING 


The discussion thus far has examined various 
aspects of the background for research using 
verbal protocol analysis. This section offers a 
critical evaluation of these efforts in auditing. 
The questions asked relative to the articles 
reviewed are as follows: 

(1) Was the research problem clearly and 
adequately defined? And, were “scientific” 
hypotheses stated, either traditionally or as 
specific research questions?

(2) Was underlying theory presented, if such
theory exists? 

(3) Were codes developed from general 
information processing theories? 

(4) Were standard research design principles 
employed? And, was the analysis appropriate to 
the study? 

(5) Did the research contribute to the fund of 
knowledge on audit judgment? 

It should be noted that all of the evaluation
criteria are generic to any research evaluation;
however, two are more specific to the criticism
of protocol studies and audit judgment efforts
([3] and [5]). In addition, it should be pointed
out that while the entire range of work done
using the protocol methodology is quite large,
the number of audit-specific verbal protocol
studies is small. With this evaluation structure in
mind, seven protocol studies in auditing will be
evaluated and compared. These studies include Biggs & Mock
(1983) (see also Mock & Turner (1981)), Boritz
(1986), Dillard & Mutchler (1986), Meservy et 
al. (1986), Biggs et al. (1987), Biggs et al. 
(forthcoming) and Boritz et al. (forthcoming). 
The research projects had different objectives, 
but all were concerned with the same process:
audit judgment.


A SEVEN STUDY CRITIQUE 


Research questions, issues and focuses 

All seven of the studies stated specific research 
questions and issues to be investigated. The 
questions, however, were not always expressed 
explicitly, nor were they reported in the tradi- 
tional statistical manner. For example, Biggs & 
Mock (1983, p. 246) state that, “the research 
questions of interest regarding information 
acquisition were both the proportion of avail- 
able information and the specific types of infor- 
mation attended to”. Other researchers made 
similar comments regarding the questions of in- 
terest in their particular investigations. In most 
cases, research questions were embedded in the 
text as opposed to being stated separately. The
reader was typically required to hunt for or to
infer research questions or issues as opposed to
finding them directly presented.

An examination of the topics covered in 
reviewed studies indicates that a wide range of 
research issues in auditing have been investi- 
gated. In spite of this variety, however, it is possi- 
ble to categorize the areas of investigation into 
three distinct groups. First, there are studies 
which have as their principal objective the
examination of judgmental or decision proces- 
ses. The work of Biggs & Mock (1983), Biggs et 
al. (forthcoming) and Biggs et al. (1987) gener- 
ally falls into this category. The second category 
is composed of investigations aimed at building 
knowledge-based expert systems. The research 
of Dillard & Mutchler (1986) and Meservy et al. 
(1986) constitutes this group. The third collection
includes methodologically-oriented efforts.
This section is best characterized by the work of 
Boritz (1986) and Boritz et al. (forthcoming). 
Each of these categories is examined in greater 
detail below. It should be noted that the 
categorization of these studies was based on the 
primary research objective. Studies could be
considered under other areas since in many 
instances multiple objectives were pursued. 
However, these alternatives were not evaluated 
in this paper. 


Judgmental and decision process studies.
Biggs & Mock (1983) searched for detailed evi- 
dence of the information-processing and choice 
behavior of auditors in an internal control task. 
They focused their analysis on the identification 
of the auditor’s “problem space,” information 
acquisition, and information use. 

Biggs et al. (forthcoming) took as their pur- 
pose the production of some initial descriptive 
evidence of how auditors perform judgmental 
analytical reviews. The study conducted by 
these researchers included numerous explicitly 
stated research questions which can be grouped 
into four blocks: (1) performance; (2) cate- 
gories of behavior; (3) information acquisition; 
and (4) decision processes. It should be noted 
that some of these questions could be stated as 
hypotheses and could be subjected to experimental
and statistical testing. A sample of the
questions asked under the various categories 
includes the following. 

(a) Performance. (1) What differences and 
similarities are there in decision performances 
of experienced and inexperienced auditors? (2) 
What effects do variations in the nature, scope 
and complexity of an analytical review task have 
on auditors’ judgments? 

(b) Categories of behavior. (1) What specific 
behaviors, in terms of operators used, were evi- 
dent in the protocols? (2) What relative propor- 
tion of categories of behavior were evident? 

(c) Information acquisition. (1) In what order 
did the auditors acquire information? (2) What 
components of the audit work papers did the 
auditors emphasize in information acquisition? 

(d) Decision processes. (1) What decisions 
did auditors make in terms of identifying audit 
problems and opportunities during analytical 
reviews? (2) What decisions did auditors make 
in terms of requests for analytical review infor- 
mation and how was it used? 

Biggs et al. (1987) asked six specific ques- 
tions as the focuses for their research. Specifi- 
cally, these queries asked about information 
acquisition, use of evaluation operators, the 
types of knowledge and reasoning processes 
employed and the accuracy and extent of deci- 
sions made. 

Each of these protocol studies asked questions 
about information acquisition, information 
evaluation and decision-making processes, 
although each examined these questions in diffe- 
rent auditing contexts. Biggs & Mock (1983) 
used an internal control environment, Biggs et 
al. (1987) examined auditing of complex EDP 
systems. In addition, each of these investigations 
also looked at some unique features of the pro- 
cesses detailed above. Biggs & Mock (1983), as 
previously indicated, specifically attempted to 
identify the auditor’s “problem space”, Biggs et 
al. (forthcoming) examined the auditor’s adjust- 
ments to information and errors in decision- 
making and Biggs et al. (1987) looked at the 
types of knowledge (domain, prototype, feature,
system, meta and planning) employed by the
auditor. A general observation which can be
made about each of these studies is that they
tend to support one another (e.g. similar behaviors
have been found in different works). In
addition, the studies add to the understanding of 
how auditors make judgments. 


Expert systems development studies. Specific 
research questions are not as readily apparent in 
the investigations aimed at building knowledge 
bases and expert systems. Meservy et al. (1986), 
however, did generate some post hoc questions
related to hypotheses produced by the computa- 
tional model developed in their research. By 
contrast, Dillard & Mutchler (1986) did not ask 
particular research questions but rather indi- 
cated that the focus of their study was model 
development and hypothesis generation. These 
authors indicated that their objective was “to 
develop a representation of the task and imple- 
ment it within a prototype knowledge based sys- 
tem” (Dillard & Mutchler, 1986, p. 1). In particu- 
lar the research was directed at the first stage of 
building an expert system based on decision- 
making in a going concern context. 

Meservy et al. (1986) indicated research 
objectives similar to those of Dillard & Mutchler, 
but extended their work beyond first stage 
development to the actual testing and validation 
of the expert system produced. Specifically, 
these investigators: 

(1) determined the processes that auditors 
use in a specific audit task (i.e. internal control); 

(2) formalized and implemented those pro- 
cesses as a conceptual model; and 

(3) tested the model. 

Meservy et al. (1986) provide what they call 
“questions of interest”. A sample of the questions 
presented by these authors includes: “Are the
hypotheses generated by the computational 
model found in the protocols of the primary 
auditor? Are the hypotheses generated by the 
computational model found in the protocols of 
other auditors for the same firm”? (p. 61). Simi- 
larly, “[a]re the reasoning processes produced 
by the model found in the protocols of the prim- 
ary auditor? Are the reasoning processes pro- 
duced by the model found in the protocols of 
other auditors for the same firm”? (p. 63).

Based on the questions detailed by these
authors, it is possible to categorize the research
areas of interest in general terms. Specifically,
four different focuses are indicated by these
researchers: (1) questions comparing the protocols
of the “primary” auditor with the computational
model; (2) questions related to the
problem solving process; (3) questions related
to the lines of reasoning employed; and (4) questions
related to adequacy of the model.

The development of knowledge-based sys- 
tems premised on the decision-making behavior 
of auditors is complementary and supplemen- 
tary to the process oriented studies first detailed. 
Indeed, the initial phase in expert systems
development is the identification of these same
decision strategies. The essential difference
between these two approaches lies in the ulti- 
mate use of the protocol generated. Researchers 
focusing on expert systems development are in- 
terested in the application of knowledge about 
decision-making, while investigators with a pro- 
cess orientation are concerned about under- 
standing the process of decision-making itself. 


Methodological studies. Two recent studies 
by Boritz (1986) and Boritz et al. (forthcoming) 
investigate methodological considerations of 
protocol analysis in an auditing environment. In 
particular, response elicitation methods (think- 
aloud vs silent) were tested within a planning 
and review context. Both studies used an experi- 
mental task to investigate questions regarding 
the effect of elicitation method on auditors’ re- 
sponses. While specific research questions (the 
researcher’s priors) were not separately stated, 
hypotheses in both efforts are sharply defined 
within the context of experimental design. A 
MANOVA was employed in Boritz (1986) which 
looked at the interaction of the auditor’s posi- 
tion within the firm, the method of elicitation, 
and the order of information (i.e. cases) used on 
the actual responses provided by subjects. 
Boritz et al. (forthcoming) used an ANOVA 
design to investigate similar questions, but also 
studied other interaction effects, especially 
those related to audit risk. 

Given the interest in conducting verbal protocol
studies, it is valuable to know what impact
the use of the technique has on the outcomes 
produced and therefore on the ability of resear- 
chers to generalize from the results. As Boritz 
(1986, p. 346) states, “to the extent that the 
behavioral experiment is an auditing research 
‘institution,’ we must strive to understand the 
ways in which the research method affects prob- 
lem-solving and researchers’ conclusions”. If 
protocols are used as a first cut at understanding 
complex decision processes, or if the protocols 
are applied in the development of expert sys- 
tems, as long as those systems are validated in 
the manner of Meservy et al. (1986), then the
response biases found by Boritz and his col- 
leagues should provide no serious problems. On 
the other hand, if broad inferences are attemp- 
ted based on the protocols, care needs to be 
exercised. This caveat would be especially true 
with expert subjects when information used 
may not be verbalized by the subject. One solu- 
tion to the weakness inherent in some research 
techniques is to use a multi-method approach. 
This orientation permits the researcher, aware 
of the particular limitations for a method, to 
compensate for weaknesses and produce 
stronger results. Indeed, a weakness of focusing
on audit studies is the possibility that a larger set
of research may be ignored.


Theoretical background for verbal protocol 
research 

An important part of the protocol study, when 
possible, should be the testing and perhaps 
development of an appropriate theory. The 
“when possible” qualification to theory develop- 
ment is especially important in auditing where 
there are few (if any) theories, that are suffi- 
ciently “rich enough or comprehensive enough 
to predict or explain auditor behavior in a realis- 
tic task setting” (Biggs et al., forthcoming, p. 6). 
However, it is important to conduct research in 
audit decision-making in settings matching, as
nearly as possible, actual audit contexts since 
laboratory settings tend to dilute the generaliza- 
bility and practical application of the study. 
Furthermore since auditors, in general, make 
good decisions (Willingham, 1985), normative
theories can be developed by describing how
these decisions are made. From this theory per- 
spective, researchers utilizing the verbal pro- 
tocol methodology have attempted to: (1) hypo- 
thesize what information subjects will or will 
not use, and (2) make inferences regarding 
problem solving strategies. 


To test a hypothesis of the first type, we need to compare 
the set of statements implied by that hypothesis with the 
statements implied by competing processes. The union 
of these sets defines the totality of coding categories that 
needs to be considered in encoding the protocols in a 
manner that is both comprehensive and neutral among 
the hypotheses (Ericsson & Simon, 1984, p. 320). 
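
The set logic in this passage can be sketched as follows, assuming each hypothesis can be written down as the set of statement categories it implies. The hypothesis names and category labels are hypothetical illustrations and are not taken from Ericsson & Simon (1984).

```python
# Hypothetical statement categories implied by two competing hypotheses
# about what information an auditor will use (illustrative labels only).
implied_statements = {
    "uses_prior_year_data": {"reads prior-year balance", "compares to current year"},
    "uses_industry_data": {"reads industry ratio", "compares to current year"},
}


def coding_categories(implied):
    """Union of all implied statement sets: every code the encoder must consider."""
    categories = set()
    for statements in implied.values():
        categories |= statements
    return categories


def discriminating_categories(implied, hypothesis):
    """Categories implied by one hypothesis and by none of its competitors."""
    others = set()
    for name, statements in implied.items():
        if name != hypothesis:
            others |= statements
    return implied[hypothesis] - others


all_codes = coding_categories(implied_statements)
only_industry = discriminating_categories(implied_statements, "uses_industry_data")
```

Coding against the full union keeps the scheme neutral among the hypotheses, while the discriminating categories are the ones whose presence or absence in a protocol can separate them.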


The second type of hypothesis permits the 
direct comparison of a sequence of activities or 
information use, as indicated by the subject’s
verbalizations, with predicted actions or information
use. Overall,


verbal data often provides a powerful means for testing 
broad theoretical generalizations, which can sometimes 
be confirmed or refuted by demonstration of the 
presence or absence of certain information in subject's 
verbalizations. But the real promise of verbal data, a 
promise already partly realized over the past two or three 
decades, lies in their use in developing and testing de- 
tailed information processing models of cognition, mod- 
els that can often be formalized in computer program- 
ming languages and analyzed by computer simulation. To 
take advantage of the power of verbal data in carrying out 
this enterprise, we must develop a methodology for en- 
coding and interpreting these data (Ericsson & Simon, 
1984, p. 220). 


Typically a research hypothesis is stated in 
general terms, particularly in the early phases of 
an investigation, and later refined to a more 
specific statement. This distinction between the 
more general hypothesis, sometimes referred to 
as a “scientific” hypothesis, and the more par- 
ticular hypothesis usually called a “statistical” 
hypothesis is important to the researcher who 
employs a verbal protocol methodology. In 
essence the protocol investigation, with the 
exception of the experimental design im- 
plemented, is a descriptive study which 
examines the phenomena of man vs a concern 
for population parameters. In this sense, verbal 
protocol analysis is not as statistically rigorous,
nor as amenable to the statistical tests found in a
typical experimental effort. This particular
feature of the methodology provides rich information
about decision processes, and may provide
scientifically acceptable data to test (refute)
certain hypotheses.

The theoretical foundations for the protocol 
studies reviewed in this paper can be divided 
into three specific areas: (1) those theories 
related to decision-making behavior (e.g. 
Newell & Simon, 1972; Einhorn & Hogarth, 
1981); (2) those concerned with the knowledge 
acquisition for use in building expert systems 
(e.g. Johnson et al., 1979; Johnson, 1984); and 
(3) those dealing with the methodological ef- 
fects produced through the use of verbal pro- 
tocols (e.g. Nisbett & Wilson, 1977; Ericsson & 
Simon, 1984; Anderson, 1985). These theoreti- 
cal frameworks roughly correspond with the 
research objectives indicated in each of the 
studies examined. 


Theories about decision-making behavior.
Although many theories of decision-making 
exist, Newell & Simon’s (1972) theory of human 
problem solving and Einhorn & Hogarth’s 
(1981) framework of processes composing 
decision-making behavior are the two most 
often cited sources of support for studies about 
auditors’ behavior. The importance of this 
theoretical grounding is evident in the state- 
ment by Biggs & Mock (1983, p. 239) that, 
“Newell and Simon’s (1972) theory of human 
problem solving provided the theoretical found- 
ation for the majority of [their] data analysis”. 

Briefly, the Newell & Simon (1972) model 
posits that decision makers utilize a “problem 
space” (i.e. goals, operators and knowledge 
states) in the problem solving process. Protocol 
studies which employ this theoretical model 
(e.g. Biggs & Mock, 1983; Biggs et al., 1987; 
Biggs et al., forthcoming) are interested in de- 
scribing subjects’ decision-making behavior. 
These researchers, however, are specifically 
interested in three interrelated subprocesses 
detailed by Einhorn & Hogarth (1981): information
acquisition, evaluation/action and feedback/learning.


Theories about information acquisition for
expert systems. Meservy et al. (1986) utilized 
the general procedures developed by Johnson 
(1984) for the elicitation of expert knowledge. 
Specifically, they employed three techniques: 
(1) observation; (2) description; and (3) intui- 
tion. The theory underlying their investigation 
lies in these methods. The observational method
— verbal protocol analysis — used the Newell & 
Simon (1972) model. The descriptive method 
— interviews — sought to formalize portions of 
an expert’s nonverbalized knowledge. Johnson
(1983) indicated that as experts become more 
competent, they become less able to verbalize 
their knowledge. Finally, the intuitive method — 
introspection — attempted to discover expert 
knowledge not found with the other techniques. 

One criticism of verbal protocol technique
focuses largely on the participant’s ability or
inability to provide substantive verbal data regarding
mental processes. The earliest charge
was that verbal protocols were nothing more
than the previously discredited methodology of
introspection (i.e. observations by experts of
their own cognitive processes). Gestalt
psychologists criticized introspection on the
grounds of inadequacy, since many aspects of
the cognitive process could not be reduced to
imaginary and sensory elements. Behaviorists
maintained that only directly observable behavior
could be used as data, and therefore
reacted against introspection. Ericsson and
Simon quote Duncker in making the case that
current methods are not equivalent to introspection.


While the introspecter makes himself as thinking the 
object of his attention, the subject who is thinking aloud 
remains immediately directed to the problem, so to 
speak allowing his activity to become verbal. When 
someone, while thinking says to himself, ‘One ought to
see if this isn’t —’, or, ‘It would be nice if one could show
that —’, one would hardly call this introspection
(Ericsson & Simon, 1984, p. 60). 


Dillard & Mutchler (1986) were generally 
concerned with the audit opinion process. 
These researchers developed a theoretical task 
specification from an analysis of authoritative
pronouncements — in particular the Statements 
on Auditing Standards No. 34, the primary docu- 
ment addressing the going concern opinion. 
And, while not elaborated upon in their re- 
search, the theoretical foundations for their 
efforts were referenced to Ericsson & Simon 
(1980), Minsky (1975) and Newell & Simon 
(1972). 


Theories about methodological effects. 
Boritz (1986) and Boritz et al. (forthcoming) 
used the theories which underlie statistical 
methodology as well as those from previous re- 
search about the effect of verbal protocol on de- 
cision-making behavior (e.g. Nisbett & Wilson, 
1977; Anderson, 1985). 

One of the most often cited articles challeng- 
ing the validity of verbal protocols is that of Nis- 
bett & Wilson (1977). These authors, looking at 
retrospective protocols (i.e. after-the-fact reports),
state that generally verbal reports are unreliable
and idiosyncratic, carry no information
that is generalizable, and provide no information 
which would further our understanding of per- 
formance. It should be pointed out, however, 
that retrospective reports are generally consi- 
dered to be less reliable than concurrent reports 
since subjects may use alternative methods to 
answer probes about previous cognitive proces- 
ses. For example, subjects may fabricate an 
answer using prior knowledge of how the task 
should be performed, or infer the general pro- 
cess from specific knowledge of a related task. It 
is worth noting that Larcker & Lessig (1984) 
when comparing linear (Lens) and retrospective 
process tracing models found the retrospective 
model to be marginally superior to the linear 
model in describing decision-making behavior. 
In any event, there seems to be a bountiful 
supply of conflicting evidence regarding just 
how valuable retrospective reports really are.

From a more global perspective, the 
methodology of concurrent verbal protocols (i.e.
real-time reporting) has been criticized using
three principal arguments: (1) the effect of verbalization
argument; (2) the incompleteness
argument; and (3) the epiphenomenality or irrelevance
argument (Ericsson & Simon, 1984).

The first such argument emphasizes that verbalization
has the effect of changing the performance
and therefore the cognitive process. Based
on his research Boritz (1986, p. 344), for
example, concluded that “the researcher’s
method of eliciting responses may affect the
actual responses provided by the subject, interacting
with other factors such as subjects’ positions,
the specific type of judgment being made,
and other contextual factors (e.g. the case under
review)”.

The incompleteness argument maintains that 
a considerable portion of the information 
utilized by the participant in making the deci- 
sion will not be verbalized. This argument might 
be especially relevant if the subject is an expert 
who processes information for some tasks in an 
automatic fashion. 

The third position holds that the processing is 
epiphenomenal — that is, the subject reports a 
parallel process which is independent of the 
actual thought process (Ericsson & Simon, 
1984). Indeed others have pointed out that a 
variety of systems with quite different properties 
can account for quite similar verbal reports 
(Clancy, 1984; Patel & Groen, 1986). 

Both Boritz (1986) and Boritz et al. (forth- 
coming) examined the first of the three argu- 
ments presented above. They concluded that 
the verbal protocol (think-aloud) methodology 
did have an effect on the auditors’ decision pro- 
cesses. In contrast, Ericsson & Simon (1984) 
support the position that instructions to talk or 
think aloud do not alter the sequence of cogni- 
tive processes; that concurrent (and retrospec- 
tive) reports provide a nearly complete record 
of the information sequence for a task and that 
verbal information is as valid as any other type of 
data. The research emphasis for Boritz (1986, p. 
336) was specifically on “the effects of different 
response elicitation methods (i.e. silent and
think-aloud)” on audit planning and review judg- 
ments. Citations to Anderson (1985) and Newell 
& Simon (1972) were provided as sources sup- 
porting the basic theoretical orientation of the 
investigation.

Boritz et al. (forthcoming) were interested in
the planning and review judgments made by
auditors. However, they were only peripherally
concerned about the methodological effects of
the verbal protocol technique. Theoretical backing
for their study was provided by a review of
numerous behavioral models described in the
literature. Specifically, issues ranging from
propositions about professional judgment to the
cognitive limitations of human information processing
were addressed.

It can be seen from this overview that there is
a considerable overlap between studies in terms
of theoretical support. In addition, it can also be 
noted that the theories used in these studies 
were quite varied. If as some researchers main- 
tain, verbal protocol analysis is a theory building 
tool, then these variations may be of little con- 
cern. Biggs et al. (forthcoming), for example, 
build a case for the use of verbal protocols as a 
vehicle for theory development. In a similar 
fashion, Meservy et al. (1986, p. 71) indicate 
that in “the process of building and testing a 
computational model, a theory was formulated 
... further, [they indicate that] the theory can 
now be manipulated or perturbed, allowing new 
predictions to be examined”. The idea expres- 
sed by these researchers — relative to theory de- 
velopment — is consistent with the notion pre- 
sented by other scholars that building a pro- 
totype is “like” building a theory (e.g. Newell & 
Simon, 1972). 


Code development 

“Theory delimits a small portion of the uni- 
verse of potentially observable behavior as rele- 
vant” (Ericsson et al., 1984, p. 5). Theory in this 
sense serves to determine which verbalizations 
should be transcribed and how these reports 
should be coded (i.e. classified in the terminology
of the theoretical model). The implication
is that codes for verbalized behavior should be
determined a priori, rather than through their more
typical a posteriori development.

Two different coding methods have been 
employed in protocol studies. The first method, 
which presumably does not require the analysis 
of meanings for the observed verbalizations, involves
a prior agreement between the participant
and the researcher as to specific signals for
their communication. Supposedly, the experimenter
needs to simply categorize the verbalization
into one of the previously agreed groups.
The second type of coding does require the
interpretation of meanings. This latter approach
is more typical in auditing and management,
while the former approach is more common in
psychology (e.g. studies using scales and
multiple choice). In essence, each distinct concept,
such as setting a goal, can be coded by mapping
the verbalization to the concept (Ericsson
et al., 1984). A hypothetical example of method
two could have the subject reporting, “What I
would like to do is complete this audit within the
prescribed time budget”. The rater would then
code this as “setting a goal”.


A third type of encoding which could be
utilized with either of the two methods
described above is automated/computerized
coding. Presently, only simple verbal tasks can
be classified using this approach. We can expect,
however, that as the work in expert systems proceeds,
coding programs for more complex
analyses will become available. The implementation
of a more uniform coding scheme,
perhaps one based on the structure of the task
under investigation, would facilitate this
development (see Klersey & Mock, 1986).
Furthermore, the ability to code automatically
would encourage larger sample sizes, increase
the statistical rigor and reduce some of the criticisms
forwarded against the methodology (see
discussion below).
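
A very simple form of automated coding can be sketched as keyword matching against an a priori code list. This is only an illustration under stated assumptions: the code labels and keyword rules below are hypothetical, and none of the studies reviewed here is known to have used this scheme.

```python
# Hypothetical a priori coding rules: (code label, trigger phrases).
# A real scheme would be derived from the theoretical model, not from this list.
CODE_RULES = [
    ("setting a goal", ("would like to", "need to", "want to")),
    ("information acquisition", ("look at", "turn to", "check the")),
    ("evaluation", ("seems", "too high", "too low", "reasonable")),
]


def code_statement(statement):
    """Return the first code whose trigger phrase occurs in the statement."""
    text = statement.lower()
    for code, phrases in CODE_RULES:
        if any(phrase in text for phrase in phrases):
            return code
    return "uncoded"


def code_protocol(statements):
    """Code every verbalization and count how often each code occurs."""
    counts = {}
    for statement in statements:
        code = code_statement(statement)
        counts[code] = counts.get(code, 0) + 1
    return counts


protocol = [
    "What I would like to do is complete this audit within the prescribed time budget",
    "Inventory turnover seems too low",
    "Hmm",
]
frequencies = code_protocol(protocol)
```

Substring matching of this kind requires no interpretation of meaning, which is why it handles only simple verbal tasks. Statements left "uncoded" would still need a human rater.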

In the reviewed studies which investigated 
decision-making processes, either as their prim- 
ary focus or as an intermediate step in model 
building, the specific codes (i.e. codes of the 
second type) developed came from general in- 
formation processing theories. Biggs & Mock 
(1983), Biggs et al. (1985), Biggs et al. (1986),
Dillard & Mutchler (1986) and Meservy et al. 
(1986) exhibited coding schemes which meet 
this specification. Neither Boritz (1986) nor 
Boritz et al. (1986) discuss coding. Indeed the
research objective sought in these studies was
different enough from the other audit protocol
investigations that no coding per se was even required.


Research design and analysis

Verbal protocol analysis is not especially con- 
ducive to the normal statistical tests because of 
the number of subjects which can be effectively 
studied given complexity and time constraints. 
And, the theories of judgment and problem solv- 
ing being investigated are, in general, not 
specific enough to provide appropriate bases for 
the development of testable hypotheses (Biggs 
et al., forthcoming). Given these conditions, 
only a limited number of statistical requirements 
are applicable to the research. A discussion of 
these elements follows. 


Sample size and external validity. One of the 
items most often pointed to in criticizing the
statistical validity of verbal protocol analysis is
the small sample sizes employed. It is not unusual
for only 3 or 4 subjects to be investigated
in the research. More recent efforts have used
larger numbers of participants (10–15), but the
sample is still limited. Since the object of the
work is to identify the process, strategy or information
used by the subject, however, and since
a concern with inferences about population
parameters per se is not of interest, the use of 
small samples does not adversely impact the 
analysis and the inferences. 

In terms of external validity, on the other 
hand, the small sample size may be a cause for 
concern. Each protocol is individual, but the 
researcher is often interested in finding features 
common to larger classes of decision makers. 
Whether the results of a limited number of sub- 
jects can be validly generalized to the entire 
population is of some doubt. 


Experimental design and internal validity. 
“The design of an experiment is essentially a plan 
for purchasing a quantity of information which, 
like any other commodity, may be acquired at 
varying prices depending upon the manner in 
which the data are obtained” (Mendenhall, 
1984, p. 157). The purchase plan alluded to 
includes the consideration of such features as 
randomization, blocking factors and control 
groups to name a few. Most verbal protocol 
studies do not implement rigid experimental design
features basically because the cost of doing
so is generally prohibitive. The work of Boritz 
(1986) and Boritz et al. (forthcoming) is an ex- 
ception to this general finding and illustrates 
that it is possible to produce strong experimen- 
tal designs using the verbal protocol technology. 
It is possible to argue, however, that a “true” ver- 
bal protocol analysis was not employed by these 
researchers since the reports were not coded. 
The efforts of these investigators, in applying experimental
design principles, provide a convenient
benchmark and suggest future directions
for research employing this methodology. 

The criticisms of concurrent verbal protocols 
which were discussed earlier in this paper
focus largely on questions of internal validity.
Specifically, concern for the effects of testing 
(the impact of verbalizing on performance) and 
the selection—maturation interaction (epiphen- 
omenality and incomplete verbalization) were 
noted (Campbell & Stanley, 1963). 

It is fair to speculate that other factors impact- 
ing the protocol participant also have a negative 
effect on the study’s internal validity. Maturation 
is one such element. Verbal protocols accumulate
a wealth of information and tend to be quite
long compared to typical experimental tasks.
Between the beginning and end of the process,
therefore, it is reasonable for the subject to
become tired, hungry and the like. These personal
reactions could have some effect on the
results obtained, but make the task more representative
of real behavior and task effects.
Again, it is important to remember that one of 
the purposes of a verbal protocol analysis is to 
describe the decision process or information 
search procedures used by the individual or to 
simply note what information they used; for this 
purpose, in spite of possible internal validity 
problems the methodology has value. The use of 
control groups and other experimental design 
considerations, however, would strengthen the 
application.


As noted, only Boritz (1986) and Boritz et al.
(forthcoming) used a rigorous experimental
design approach. Boritz (1986) implemented a
2 × 5 × 4 factorial with randomization in investigating
the effects due to the method of elicitation
and analyzed the data using analysis of variance
(ANOVA) techniques. Boritz et al. (forthcoming)
also employed ANOVA to analyze the
data, but utilized a 4 replicate of a 2 to the 7th
factorial in 64 units as the design. The other
studies used essentially a case study approach to
define a task and descriptive statistics in data
analysis.
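
To make the size of such a design concrete, the cells of a 2 × 5 × 4 full factorial can be enumerated and placed in a random order, as in the sketch below. The assignment of factors to the 5 and 4 levels, and all level names, are assumptions for illustration only. The text identifies the factors as elicitation method, auditor position and case order, but does not state which has how many levels.

```python
import itertools
import random

# Assumed factor levels (illustrative only); the source reports just "2 x 5 x 4".
factors = {
    "elicitation": ["think-aloud", "silent"],
    "position": ["level_1", "level_2", "level_3", "level_4", "level_5"],
    "case_order": ["order_A", "order_B", "order_C", "order_D"],
}


def full_factorial(factor_levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factor_levels)
    combos = itertools.product(*(factor_levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def randomized_order(cells, seed):
    """Return a reproducibly shuffled copy of the design cells."""
    rng = random.Random(seed)
    shuffled = list(cells)
    rng.shuffle(shuffled)
    return shuffled


cells = full_factorial(factors)            # 2 * 5 * 4 = 40 treatment combinations
run_order = randomized_order(cells, seed=1986)
```

Even this modest design has 40 distinct cells, which illustrates why fully crossed designs are costly when each observation is a lengthy verbal protocol.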

An important point made by Boritz et al. 
(forthcoming, p. 401), however, is that, “since the 
design is a within-subjects ANOVA design, each 
subject is his or her own control, and changes in 
judgments in response to factors under experi- 
mental control are of primary importance rather 
than the actual judgments themselves”. This is 
considerably different from process oriented 
studies such as Biggs & Mock (1983) or Meservy 
et al. (1986) where the judgments themselves 
were the most important concern. It seems 
reasonable that this difference in emphasis by 
Boritz et al. facilitates the use of traditional 
statistical experimental design techniques. Con- 
versely, the focus on decision-making processes 
by other researchers would hamper the imple- 
mentation of these design methods. The argu- 
ment being proposed by this line of reasoning is 
that the type of analysis used is conditional. It 
depends (as it should) on the object of the inves- 
tigation. In the Boritz et al. (forthcoming) study 
rigorous statistical techniques were appropriate, 
but in the research investigating decision-mak- 
ing processes, less powerful descriptive and 
graphical techniques were called for — and 
were also appropriate. 


Contribution of the research 

Each study has contributed to the fund of 
knowledge on audit judgment. These contribu- 
tions can be grouped according to the following 
scheme: exploratory findings, hypothesis gener- 
ation and testing, expert systems development 
and methodology. Tables 1–3 display these contributions
according to the research objective of
the study, the researcher(s) and the scheme in- 
dicated above. Because the number of research 
findings is quite large, the tables present only a 
synopsis of the main contributions. The greatest 
number of research contributions fall in the
category of exploratory findings followed in
order by hypothesis generation/testing,
methodology and expert systems.


FUTURE RESEARCH AND POSSIBILITIES 


An area for future research which has implica- 
tions for both the collection and analysis of ver- 
bal protocols is automated and computerized 
coding. If common or uniform coding schemes 
can be generated for similar tasks, particularly if 
these codes can be computerized, then the pos- 
sibilities exist for large sample sizes, lower costs 
and greater comparability between investiga- 
tions. A further implication has to do with the 
question of experimental design. If larger 
samples of verbal protocols for decision-making 
processes become available then more rigorous 
design considerations would be possible and 
appropriate. 

Biggs et al. (forthcoming) suggest as a future
direction the investigation of types of knowl- 
edge representation of each auditor, especially 
along the experience dimension. Some steps in 
this direction have, in fact, been taken by Boritz 
et al. (forthcoming) where experience was 
defined and tested as an independent variable. In 
this regard Boritz et al. (1987, p. 117) note that, 
“experience with making the judgments in- 
volved in [their] study depended strongly upon 
subjects’ rank or position. Junior auditors 
tended to have relatively little experience with 
the required judgments. The implication of this 
finding is that researchers in this area need to
draw upon relatively experienced auditors to 
permit informative conclusions to be drawn 
from their research”. 

Meservy et al. (1986, p. 71) also have some 
suggestions for future research efforts. These 
researchers propose “systematically manipulat- 
ing the model, creating and examining new 
hypotheses, and constructing specific task 
materials from which different lines of reason- 
ing, assumptions, and ‘garden paths’ resulting in 
sub-optimal solutions can be predicted”. 

Payne (1982, p. 397) suggests that, “the 
primary focus of decision research should now
be the search for some general principles from
which contingent processing would follow”. 
This research notion derives from Payne’s con- 
tention that one of the main results of years of re- 
search in decision-making is that seemingly 
minor changes in task and context cause signifi- 
cant changes in decision behavior. This has rele- 
vance in the auditing environment where busi- 
ness situations (e.g. management, financial for- 
tunes, the economy, etc.) can vary widely from 
year to year. 

One final area for future work would be the 
application of decision theories in audit judg-
ment contexts which have, to date, not received
much attention in the literature. Specifically, 
Information Integration Theory (Anderson, 
1981) or Social Judgment Theory (Hammond et 
al., 1975) could be investigated by audit judg- 
ment researchers. 


SUMMARY 


This paper has explored the rationale and sig- 
nificance of research in audit judgment. Follow- 
ing this exploration, verbal protocol research in 
auditing was evaluated and some suggestions for 
future research presented. 

It can be concluded from this review that 
whatever the purpose of the protocol analysis, 
the technique provides important information 
helpful in understanding more about the proces- 
ses used by the subjects in making their judg- 
ments. Most often, the researcher is concerned 
about the information attended to, operators 
used, the evaluation criteria employed and the 
reasoning underlying the subjects’ decisions, 
but the variety of information gathered is rich 
enough that many other variables could be con- 
sidered. As Boritz (1986, p. 335) notes, “studies 
of audit judgment are a major focus of auditing 
research due to their potential policy implica- 
tions for enhancements to professional practice 
in areas such as development and modification 
of auditing methods, standards, and procedures, 
approaches to training and supervision, and 
creation of computer-assisted decision aids”. 

From the technical perspective, there are
basically three areas of concern with the verbal
protocol technique: philosophical, statistical
and methodological. The philosophical issues in-
volve questions about introspection and the
nature of data. The statistical considerations in- 
clude the difficulties associated with measure- 
ment, sampling, experimental design and infer- 
ence. And, methodological concerns include the 
effect of verbalization on the decision process
and consequently on the value of the results ob-
tained. Arguments have been presented on both
sides of these issues, but no clear conclusions
have been reached regarding their overall valid-
ity. What is clear, however, is the richness of de-
tail provided by verbal protocols about task per- 
formance, judgment and decision-making. From 
this perspective it is reasonable to conclude that
in spite of potential limitations, verbal protocol 
analysis is a useful tool which provides impor- 
tant insights (e.g. cues attended to and strategies 
used by auditors) into the understanding of 
audit judgment.


Abstract


This field experiment (n = 137 auditors) examined the efficacy of a red flags questionnaire for assessing 
the risk of material fraud at a client. Two variables (evaluation of a fraud case or a no-fraud case, and use
or non-use of a red flags questionnaire) were manipulated. Questionnaire users showed increased com-
prehensiveness and uniformity in data acquisition. However, questionnaire use had no significant impact
on fraud risk assessment for the no-fraud case and was dysfunctional for the fraud case. Possible reasons for
the dysfunction are explored and suggestions for future research are presented. 


Over the past decade in the United States, the ac- 
counting profession, the users of financial state- 
ments and the government have expressed 
growing concern with the incidence of fraudu- 
lent financial reporting and the problems of de- 
tecting fraud. At the University of Kansas’ 1974
Symposium on Auditing Problems, George Cat-
lett noted that in 1960 it was rare for an account-
ing firm to have a fraud case, and rarer still for the 
fraud to be significant. However, by 1974 the in- 
crease in both fraud cases and concern over au- 
ditors’ responsibility for fraud detection had led 
to fraud becoming “a major factor in the opera- 
tion of accounting firms” (Catlett, 1975, p. 13). 
The pressure continued to build so that by 1980 
the call for more effective fraud detection was 
widespread (Romney et al., 1980). In the past 
few years, many signs have provided indications 
of the increasing importance of finding ways to 
prevent and detect fraud. These signs include 
exploding litigation and liability insurance rates, 
a new series of Congressional investigations of 
the accounting profession by the House Sub- 
committee on Oversight and Investigations, the 
Auditing Standards Board’s decision to re-
examine the issue of auditors’ responsibility for
fraud detection as expressed in Statement on
Auditing Standards no. 16, and the establishment
of a National Commission on Fraudulent Finan-
cial Reporting.

As the movement to increase auditors’ respon-
sibility for fraud detection has gained momen-
tum, there has been a growing interest in the use 
of red flags as potential indicators of fraud. In 
1985, Price Waterhouse (1985, pp. 19-20) 
suggested that auditors should acknowledge the 
responsibility to search for material manage- 
ment fraud and proposed expanding profes- 
sional auditing standards to create more em- 
phasis on red flags as a fraud detection tool. In 
1987, the National Commission on Fraudulent 
Financial Reporting recommended revising au- 
diting standards to reflect an affirmative obliga- 
tion to detect fraud and increased attention to 
potential indicators of fraud (National Commis- 
sion on Fraudulent Financial Reporting, 1987). 
Also in 1987, the Auditing Standards Board is- 
sued an exposure draft of a revised version of 
Statement on Auditing Standards no. 16, which 
concerns the auditor’s responsibility for errors
(unintentional misstatements) and irregularities
(intentional misstatements) in audited financial 
statements. This revised standard contains a 
more affirmative emphasis on the auditors’ re- 
sponsibility to consider the possibility of mater- 
ial misstatements due to fraud. According to the 
new standard, when the audit plan is being de-
veloped, the auditors should make a preliminary
assessment of the risk of material irregularities. 
The revised standard also contains considerable 
discussion of the use of red flags as indicators of 
the potential for fraudulent financial reporting 
(Auditing Standards Board, 1987). 

This paper reports on a study of the efficacy of 
red flags questionnaires, an audit tool designed 
to help auditors assess the risk of material fraud 
on ordinary audit engagements. The subjects 
were 137 in-charge (mid-level) accountants at a 
large CPA firm. The subjects evaluated the possi-
bility of fraud at the planning stage of an ordinary
audit for an individual client case, based on real 
data. Approximately half the subjects used a red 
flags questionnaire, half did not. Further, approx- 
imately half the subjects assessed a case where 
the financial statements were materially mis-
stated due to fraud and the other half assessed a
parallel case with no material misstatements.


The results indicate that the use of a red flags 
questionnaire led to increased comprehensive- 
ness and uniformity in data acquisition for both 
the fraud and no-fraud experimental cases. How- 
ever, there was no significant difference in fraud 
risk assessment between questionnaire users 
and non-users for the experimental case where 
no material fraud existed. Moreover, the use of a 
questionnaire was dysfunctional for the experi- 
mental case where material fraud existed, i.e. the 
questionnaire users assessed the risk of fraud to 
be lower than did non-users. Additional explora- 
tion of the data yields some results concerning 
two potential sources of the dysfunction: failure 
to consider some relevant information cues and 
overemphasis on one set of information cues 
over another set.


The paper begins with a discussion of prior re-
search on using red flags to assess the risk of
material fraud. The research questions, hypoth-
eses and design are then developed. Finally, the
results are presented and discussed.


PRIOR RESEARCH 


The red flags approach to fraud detection 

Red flags are “potential symptoms existing 
within the company’s business environment that 
would indicate a higher risk of an intentional 
misstatement of the financial statements” (Price 
Waterhouse, 1985, p. 31). Uretsky (1980, pp.
90-91) discussed the role of red flags on ordi-
nary audit engagements: 


... auditors must be alert for signs that management’s in- 
tegrity should be viewed with additional skepticism, for 
conditions that may provide a motive for management 
fraud, and to signs that fraud has occurred. This is ac-
complished by their perspicacity in dealing with manage-
ment and by so-called red flags. Red flags are situational 
indicators. They indicate that the auditor should be more 
watchful than usual, and in combinations they may indi- 
cate that the auditor should be suspicious. 


Sorenson & Sorenson (1980) describe the his- 
tory of the development of the red flags ap- 
proach, beginning in the mid-1970s with 
Touche Ross’s design (in response to the Sec- 
urities and Exchange Commission’s Accounting 
Series Release no. 153) of a set of warning signals 
for fraud based on economic factors and busi- 
ness structure factors. Other accounting firms 
also created red flags lists, and some Statements 
on Auditing Standards (e.g. nos 6, 16 and 17) dis- 
cussed potential problem indicators. Albrecht et 
al. (1980), based on their literature review and 
a review of known fraud cases, identified 95 po- 
tential red flags including those related to situa- 
tional pressure (e.g. extremely rapid expansion 
through new business or product lines), oppor- 
tunity to commit fraud (e.g. a firm in which there 
are no annual vacations of executives) and per- 
sonality factors (e.g. a key executive who is ar- 
rogant or egocentric). Price Waterhouse (1985) 
also presents a partial inventory of red flags. 


Predictive ability studies 
Research on the red flags approach to date has 
concentrated on the empirical relationship be-
tween the existence of red flags and the occurr-
ence or non-occurrence of fraud. Empirical
models attempting to identify fraud vs no-fraud 
firms based solely on red flags (e.g. Sorenson et 
al., 1983; Wallace, 1983; Jones & Maher, 1987)
have not yet established impressive predictive 
ability. These studies have examined a restricted 
set of red flags dealing with financial statement 
indicators, such as ratios which may indicate 
financial distress. However, many of the red flags 
noted in the professional literature are non- 
financial in nature (e.g. Is the company reluctant 
to provide the auditor with data needed to com- 
plete the examination? Does the company have 
unrealistic productivity measurements or ex- 
pectations?). 

Albrecht & Romney (1986) used a broader set 
of red flags in their predictive ability study. They 
sent red flags questionnaires to 20 CPA firms and 
asked the firms to complete them for a
self-selected sample of fraud and no-fraud clients of
the firms. Data on 27 past fraud cases and 36 no-
fraud cases were obtained. Of the 87 red flags on 
the questionnaire, 31 had predictive ability, 30 
did not and 26 could not be tested due to insuffi- 
cient data. 

Predictive ability for red flags will be limited 
by the nature of the approach. Users of the red 
flags approach have always acknowledged that 
while red flags are thought to be associated with 
fraud, the association is known to be imperfect. 
Specifically, red flag conditions may occur both 
in non-fraud and fraud situations, as Elliott & 
Willingham (1980, p. 8) point out: 


Red flags do not indicate the presence of fraud. They are 
conditions believed to be commonly present in events of 
fraud and they therefore suggest that concern may be 
warranted. For instance, insufficient working capital ...
may predispose some managements to misstate financial 
statements. To an honest businessperson the same condi- 
tions would simply be a harsh fact of business life. ... 


The advantages and disadvantages of a red 
flags approach 


The purpose of a red flags approach is to in-
crease auditors’ sensitivity to the possibility that
fraud may exist at a client. A red flags approach is
functional to the extent that it appropriately
raises the auditor’s sensitivity to the possibility
of fraud in fraudulent situations, without creat-
ing excessive suspicions of fraud in non-fraud 
situations. 

A red flags questionnaire is an audit tool which
adds structure to an auditor’s consideration of 
the risk of material fraud. Cushing & Loebbecke 
(1986) note that there has been a general move- 
ment among larger accounting firms toward 
greater overall structuring of the audit process, 
although there are still varying degrees of struc- 
ture in the approaches of different CPA firms. As 
motivations for the move to a structured ap- 
proach, Cushing and Loebbecke cite the desires 
for consistency across auditors, minimization of 
audit risks and costs, improvement of
auditor-client communication, and distinguishable mar-
ket images. 

One element used to structure an overall 
audit approach is the use of audit tools designed 
to assist the auditor in conducting the audit. El- 
liott & Jacobson (1987, p. 208) note that the red 
flags approach to fraud risk assessment adds to 
the structure of the audit process and call the ap- 
proach a “significant advance in audit technol- 
ogy”. Romney et al. (1980) suggested that the 
use of a red flags questionnaire could increase 
the probability of fraud detection with little in- 
cremental audit effort or cost. 

The use of a red flags questionnaire has the 
same advantages as the use of another common 
audit decision aid, the internal control ques-
tionnaire. The gathering of information about
factors relevant to audit decisions is facilitated
by making data acquisition more comprehensive
and uniform (the same questions are considered 
regardless of which auditor performs the task). 
For both the internal control questionnaire and 
the red flags questionnaire, however, the infor- 
mation collected must still be evaluated and in- 
terpreted by the auditor (with or without the 
use of additional decision aids) as there is not at
present a reliable empirical model for combin-
ing the observed cues into an optimal decision.


The potential disadvantages of a red flags 
questionnaire approach to assessing the possibil-
ity of fraud are again similar to the potential dis-
advantages of an internal control questionnaire
approach to assessing the adequacy of an inter-
nal control system. The red flags questionnaire
approach, like most structured and semi-struc- 
tured data collection methods, is designed to 
make data collection more comprehensive and 
uniform. The structure of the approach drives 
data acquisition and focuses attention on the 
cues which are listed. Cushing & Loebbecke 
(1986, p. 43) suggest that reliance on question- 
naires to provide structure “could cause the au- 
ditor to fail to observe important facts, or fail to 
reason through to appropriate judgments and 
conclusions.” 

Purvis (1987) reviews the literature on struc- 
tured and semi-structured methods of data col- 
lection and concludes that the imposition of 
structure on the data collection process can 
have dysfunctional as well as functional aspects. 
A structured approach may become dysfunc-
tional in one of two ways: (1) the structuring of
the questionnaire may lead to an excessive em-
phasis on some cues over other cues (e.g. pri-
macy or recency effects), and (2) the user may
focus attention exclusively on the cues which
are listed in the questionnaire to the exclusion of
other cues in the environment and thus may 
miss some relevant cues. Purvis then conducted 
a study of the impact of recording format (inter- 
nal control questionnaires vs flowcharts vs nar- 
rative memos) on the evaluation of internal ac-
counting control. Subjects who used internal
control questionnaires had more comprehen- 
sive and less variable data sets, which indicates 
that the questionnaire did facilitate data collec- 
tion. However, there was an accompanying fai- 
lure to recognize relevant information cues not 
considered in the internal control questionnaire 
version used, whereas these cues were recog- 
nized by narrative and flowchart users. 


RESEARCH QUESTIONS AND HYPOTHESES 


At this early stage of research exploration of
the efficacy of the red flags approach, we do not 
yet have any empirical evidence to help provide 
answers to two important questions: 

(1) Does the use of a red flags questionnaire to
consider indicators of the possibility of fraud re-
sult in a more comprehensive and uniform data
set? 

(2) Is the use of a red flags questionnaire to 
consider indicators of the possibility of fraud 
functional or dysfunctional? 

This study was designed to allow testing of a 
set of hypotheses concerning comprehensive-
ness, uniformity and effectiveness of red flags
questionnaires. The hypotheses are paired to 
consider both a client situation where material 
fraud exists and a client situation where no 
material fraud exists.


The comprehensiveness hypotheses

Since the goal of a structured approach is to 
facilitate data collection, the expectation would 
be that questionnaire users consider more po- 
tential fraud indicators than non-questionnaire 
users, both for the fraud case and the non-fraud 
case: 

H1. For a case where fraud exists, auditors who use a red
flags questionnaire consider more potential indicators of
fraud than do auditors who do not use a red flags ques-
tionnaire.


H2. For a case where no material fraud exists, auditors
who use a red flags questionnaire consider more poten- 
tial indicators of fraud than do auditors who do not use a 
red flags questionnaire. 


The uniformity hypotheses

Since the goal of a structured approach is to
facilitate data collection, the expectation would 
be that there is less variation in the data set con- 
sidered by questionnaire users than by non- 
questionnaire users. A red flags questionnaire 
provides the auditor with a set of potential indi- 
cators of fraud to consider. These potential indi- 
cators must be evaluated for each specific client 
situation. In a given situation, a particular indi- 
cator may provide positive (an indication no 
material fraud exists), neutral or negative (an in- 
dication potential material fraud exists) input 
into the auditor's assessment of the risk of mater- 
ial fraud at the client. Questionnaire users, then, 
will consider a mix of positive, neutral and nega- 
tive indicators. As the questionnaire approach 
drives data acquisition, there is a forced unifor-
mity in the data sets of questionnaire users. If
data collection is uniform among questionnaire
and non-questionnaire users, the same propor-
tion of positive, neutral and negative indicators
should be observed by both groups. However, if 
non-questionnaire users are asked to assess the 
risk of material fraud at a client, there may be 
more variability in the types of indicators consi- 
dered. In particular, if the explicit task is to as- 
sess the risk of material fraud, non-questionnaire 
users would be expected to focus more on nega- 
tive indicators (those suggesting potential 
material fraud exists) than positive or neutral in-
dicators. This leads to two uniformity hypoth-
eses to be tested:

H3. For a case where fraud exists, auditors who do not use
a red flags questionnaire consider a greater proportion of
negative indicators of fraud than do auditors who use a
red flags questionnaire.


H4. For a case where no material fraud exists, auditors
who do not use a red flags questionnaire consider a great- 
er proportion of negative indicators of fraud than do au- 
ditors who use a red flags questionnaire. 


An additional formulation of the uniformity 
hypotheses is also possible: a comparison of the 
variances in the proportion of negative indi- 
cators considered by questionnaire users and 
non-users, with an expectation that non-users 
would have a greater variance than question- 
naire users. 


The effectiveness hypotheses

A red flags approach is functional to the extent 
that it appropriately raises the auditor's sensitiv- 
ity to the possibility of fraud in fraud situations, 
without creating excessive suspicions of fraud in 
non-fraud situations. Thus, the use of red flags 
questionnaires would be functional if question- 
naire users’ assessments of the likelihood of 
fraud at a client are both higher than non-users’
assessments for cases where fraud exists and
lower than non-users’ assessments for cases
where no fraud exists.¹ As there is no a priori
evidence to indicate whether a structured ap- 
proach to fraud risk assessment is functional or 
dysfunctional, both alternatives must be consi- 
dered against the null:

H5. For a case where fraud exists, auditors who use a red
flags questionnaire assess the likelihood of fraud to be
significantly different to the assessment of auditors who
do not use a red flags questionnaire. (If the questionnaire
is functional, the assessments of questionnaire users will
be significantly higher than those of non-users. If the
questionnaire is dysfunctional, the assessments of ques-
tionnaire users will be significantly lower than those of
non-users.)


H6. For a no-fraud case, auditors who use a red flags ques-
tionnaire assess the likelihood of fraud to be signific-
antly different to the assessment of auditors who do not
use a red flags questionnaire. (If the questionnaire is func-
tional, the assessments of questionnaire users will be sig-
nificantly lower than those of non-users. If the question-
naire is dysfunctional, the assessments of questionnaire
users will be significantly higher than those of non-
users.)
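
The decision rule behind the two effectiveness hypotheses can be sketched in a few lines of Python. This is only an illustration: the function name, its arguments and the `significant` flag (standing for the outcome of whatever two-tailed test is applied) are assumptions introduced here, not part of the study's materials.

```python
def classify_questionnaire_effect(fraud_exists, user_mean, nonuser_mean, significant):
    """Label the effect of questionnaire use on fraud risk assessment for one case.

    fraud_exists  -- True for the fraud case, False for the no-fraud case
    user_mean     -- mean assessed chance of fraud among questionnaire users
    nonuser_mean  -- mean assessed chance of fraud among non-users
    significant   -- hypothetical flag: did a two-tailed test reject the null?
    """
    if not significant:
        return "no significant effect"
    users_higher = user_mean > nonuser_mean
    # Functional: users assess higher risk when fraud exists,
    # and lower risk when no fraud exists.
    if users_higher == fraud_exists:
        return "functional"
    return "dysfunctional"
```

With hypothetical means, a fraud case in which users average 40 and non-users 60 would be labelled dysfunctional, while a no-fraud case with the same ordering would be labelled functional.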


THE RESEARCH DESIGN 


Subjects 

The subjects were 137 auditors at the mid- 
level of a large CPA firm, with an average of 18 
months experience being in charge of field work 
for their clients. The firm uses some structured 
and semi-structured audit tools such as internal 
control questionnaires in its audit approach, but 
(consistent with current audit standards) does 
not make a specific assessment of the risk of 
material fraud on ordinary audit engagements. 

¹Additionally, questionnaire use could be advantageous even if mixed functional and dysfunctional results occur, provided
that the benefits of questionnaire use exceed the costs. For example, questionnaires could still be advantageous if question-
naire users always assess the risk of fraud to be greater than non-users, provided that the benefits of raising questionnaire
users’ suspicions of fraud for fraud cases more than offset the costs of unduly raising questionnaire users’ suspicions of fraud
for non-fraud cases. Or, other benefits (e.g. better documentation of items considered in fraud risk assessment or facilitation
of training of new auditors) could offset the costs of dysfunctional fraud risk assessment by questionnaire users. The possibil-
ity that the benefits of questionnaire use exceed the costs, despite a dysfunctional impact on fraud risk assessment, cannot
be examined in this study as information on costs and benefits is not available.

It was preferable to choose all subjects from a
single firm to achieve as homogeneous a subject
group as possible in terms of training and other
firm-related variables which might create differ-
ences in fraud detection ability. However, even
within a single firm, auditors will vary in both
general and specific experience and may differ
in terms of their prior expectations about the 
chance that material fraud exists at one or more 
clients their office or firm serves. Thus, back- 
ground information on general audit experi- 
ence, specific fraud detection experience and 
prior expectations about fraud was collected 
from each subject to provide a means of examin- 
ing the possibility of confounding effects of 
these variables. There were no significant differ- 
ences (at the 0.10 level) in these experience and 
prior expectations variables between question- 
naire user and non-user experimental groups 
and between fraud case and no-fraud case ex- 
perimental groups. 


Experimental task

The experimental task required the subjects 
to review a set of detailed background informa- 
tion for an audit client, including company his- 
tory, information about company management, 
plans and policies and a set of multiple prior
years’ audited financial statements and current
unaudited financial statements. Taking the role
of an in-charge auditor during the planning 
phase of a continuing audit, the subjects were 
asked to assess the chance that material fraud 
exists at the client. Approximately half the sub- 
jects (n = 68) used a red flags questionnaire to 
aid them in their assessment, half (n = 69) did 
not. Additionally, two experimental cases were 
used, one being a case where the current year fi- 
nancial statements were materially misstated 
due to fraud (n = 69) and the other (n = 68) 
being a parallel no-fraud case. The question-
naires and cases were randomly distributed. Of
the questionnaire users, 33 received the fraud
case and 35 the no-fraud case. For the non-users,
36 received the fraud case and 33 the no-fraud
case.

The fraud case was developed from an actual
audit case where material fraud existed at a 
client. After the discovery of the fraud at the ac- 
tual client, a special fraud investigation was con- 
ducted, including an extensive audit to deter- 
mine what the financial statements would have 
shown if the fraud had not occurred. The results 
of this special fraud audit provided the informa- 
tion needed to construct a parallel no-fraud case, 
with the key difference being that the unaudited 
financial statements of the non-fraud case con- 
tain the figures which would have appeared had 
the fraud not occurred. 
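
The group sizes of the 2 x 2 design can be cross-checked with a short sketch. The four cell counts below are taken from the column headings of Table 1 (n = 33, 35, 36, 33); the dictionary layout and the helper name `marginal` are illustrative assumptions.

```python
# Cell sizes as given in the column headings of Table 1.
cells = {
    ("questionnaire", "fraud"): 33,
    ("questionnaire", "no-fraud"): 35,
    ("no questionnaire", "fraud"): 36,
    ("no questionnaire", "no-fraud"): 33,
}


def marginal(cells, axis, level):
    """Sum the cell sizes whose key matches `level` on the given axis (0 = tool, 1 = case)."""
    return sum(n for key, n in cells.items() if key[axis] == level)


questionnaire_users = marginal(cells, 0, "questionnaire")
non_users = marginal(cells, 0, "no questionnaire")
fraud_case = marginal(cells, 1, "fraud")
no_fraud_case = marginal(cells, 1, "no-fraud")
total = sum(cells.values())
```

These cell sizes give the marginal totals stated for the experimental task: 68 questionnaire users, 69 non-users, 69 fraud-case subjects, 68 no-fraud-case subjects and 137 subjects overall.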

The red flags questionnaire was developed 
from an initial starting point of the Romney et al.
(1980) questionnaire. It was modified to in- 
clude additional red flags from current profes- 
sional standards and various CPA firm materials. 
Pilot testing resulted in a reduction of the num- 
ber of items on the questionnaire due to the 
elimination of redundant items. The final version 
contained 73 questions.²

Questionnaire users were asked to examine 
the client information and evaluate each of the 
items on the questionnaire according to a seven 
point scale. The scale values ranged from “1 = 
strong indication material fraud does not exist” 
through “4 = neutral: no indication about 
whether or not material fraud exists” through “7 
= strong indication material fraud may exist”. 
On completion of the questionnaire, the sub- 
jects were asked to evaluate on a scale of 0-100, 
where 0 equals no chance and 100 equals com- 
plete certainty, the chance that material fraud 
exists at the client for the year under audit. 

Non-questionnaire users were asked to 
examine the client information and list the facts 
or impressions in the case which were relevant 
to them in assessing the chance of material fraud 
existing at the client. They were asked to 
evaluate each of their listed facts or impressions 
on the same seven point scale described above. 
Finally, non-users were asked to evaluate the 
chance of material fraud at the client, using the 
0-100 scale described above.
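
The seven point scale lends itself to a simple coding of each evaluated item as a positive, neutral or negative indicator. The sketch below assumes that ratings 1-3 count as positive (fraud unlikely), 4 as neutral and 5-7 as negative (fraud possible); only the endpoints and the midpoint are labelled in the text, so the treatment of the intermediate points is an assumption, as are the function names and the sample ratings.

```python
from collections import Counter

CATEGORIES = ("positive", "neutral", "negative")


def categorize(rating):
    """Map a rating on the seven point scale to an indicator category."""
    if rating not in range(1, 8):
        raise ValueError("rating must be an integer from 1 to 7")
    if rating <= 3:
        return "positive"  # indication that material fraud does not exist
    if rating == 4:
        return "neutral"   # no indication either way
    return "negative"      # indication that material fraud may exist


def summarize(ratings):
    """Return the count and proportion of each category for one subject."""
    counts = Counter(categorize(r) for r in ratings)
    total = len(ratings)
    return {
        c: {"count": counts[c], "proportion": counts[c] / total if total else 0.0}
        for c in CATEGORIES
    }


# Hypothetical ratings for one subject, for illustration only.
example = summarize([1, 4, 5, 7, 6, 2, 4, 7])
```

Counts of this kind correspond to the per-subject quantities averaged in a table of indicator numbers, and the proportions to those in a table of indicator proportions.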


²Copies of the experimental materials are available on request from the author at the University of Southern California, School
of Accounting, Los Angeles, CA 90089-1421, U.S.A.


RESULTS


Tests of the comprehensiveness hypotheses

Table 1 provides the mean numbers of posi- 
tive (strong, moderate or weak indication no 
material fraud exists), neutral, negative (strong, 
moderate or weak indication material fraud may 
exist) and total indicators examined by the sub- 
jects, broken down into the four experimental 
groups. These results provide evidence, as pre-
dicted in H1 and H2, that questionnaire users
considered a more comprehensive data set than 
non-questionnaire users. Questionnaire users 
considered significantly more total potential in- 
dicators of fraud than non-questionnaire users, 
both for the fraud case (t = 38.80, p = 0.000,
one-tailed) and the no-fraud case (t = 94.58, p = 
0.000, one-tailed). 


Tests of the uniformity hypotheses 

As can be seen in Table 2, the results provide 
evidence, as predicted in H3 and H4, that ques-
tionnaire users considered a more uniform data
set than non-questionnaire users. Non-question-
naire users considered a significantly greater
proportion of negative indicators (and signific-
antly lower proportions of both positive and
neutral indicators) than questionnaire users.
These results were found both for the fraud case
(proportion negative: t = -8.51, p = 0.00, one-
tailed) and the no-fraud case (proportion nega-
tive: t = -8.50, p = 0.00, one-tailed).

A second test of uniformity is a comparison of
the variance in the proportion of negative items
considered by questionnaire users and non-
users. Using this alternative test of uniformity,
the uniformity achieved by checklist users is sig- 
nificantly greater for the fraud case, but not for 
the no-fraud case. For the fraud case, the stand- 
ard deviation in the proportion of negative indi- 
cators was 0.121 for questionnaire users and 
0.220 for non-users. The F-value for the differ- 
ence in variances was 3.31, which is significant 
at the 0.001 level. For the no-fraud case, the 
standard deviations were 0.149 for question- 
naire users and 0.183 for non-users. While the 
variance was nominally greater for non-users, 
the difference in variances for the no-fraud case 


TABLE 1. Mean number of positive, neutral and negative fraud risk indicators evaluated

                       Red flags questionnaire users*    Non-questionnaire users*
                       Fraud case     No-fraud case      Fraud case     No-fraud case
                       (n=33)         (n=35)             (n=36)         (n=33)
Positive indicators    24.12          21.69               1.03           1.67
Neutral indicators     21.76          23.03               1.78           1.67
Negative indicators    24.76          27.83               7.44           8.67
Total indicators†      70.64          72.54              10.25          12.00

*The differences (two-tailed t-test) between the fraud and no-fraud cases are not
significant at the 0.10 level within the questionnaire users group. The no-questionnaire
group considered more positive indicators (t = -2.17, p = 0.033) for the
no-fraud case than for the fraud case. The differences between questionnaire
users and non-users in the number of positive, neutral, and negative indicators are
significant at the 0.000 level for both the fraud and the no-fraud case.

†The maximum possible total for questionnaire users was 73 items. If a questionnaire
user felt there was not enough information available to evaluate an item, the
instruction was to leave the item blank. There was no limit as to the number of
items for non-questionnaire users. Differences in the total indicators examined by
questionnaire users and non-users are significant, in the predicted direction, for
both the fraud case (t = 38.80, p = 0.000, one-tailed) and the no-fraud case (t =
94.58, p = 0.000, one-tailed).
TABLE 2. Proportion of positive, neutral and negative fraud risk indicators evaluated

                       Red flags questionnaire users     Non-questionnaire users
                       Fraud case     No-fraud case      Fraud case     No-fraud case
Proportions            (n=33)         (n=35)             (n=36)         (n=33)
Positive indicators    0.3373         0.2998             0.0985         0.1316
Neutral indicators     0.3109         0.3168             0.1897         0.1415
Negative indicators    0.3518         0.3834             0.7117         0.7269

Note: none of the differences (two-tailed t-test) between the fraud and no-fraud
cases are significant at the 0.10 level within the questionnaire users group or
within the no-questionnaire users group. The differences in the proportion of
negative indicators between the questionnaire users and non-users are significant
in the predicted direction both for the fraud case (t = -8.51, p = 0.000, one-tailed)
and the no-fraud case (t = -8.50, p = 0.000, one-tailed). The differences
between questionnaire users and non-users in the proportion positive and the
proportion neutral are also significant at the 0.000 level for both the fraud and the
no-fraud case (except 0.006 for % neutral, fraud case).


was not significant at the 0.10 level (F = 1.52). 
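
The variance comparison reported above can be checked from the published standard deviations alone. The following is a minimal sketch, assuming the F-value is the simple ratio of the larger sample variance to the smaller one; the function name is illustrative and not taken from the study. Because the standard deviations are rounded to three decimals, the no-fraud ratio comes out near 1.51 rather than exactly the reported 1.52.

```python
def variance_ratio(sd_larger: float, sd_smaller: float) -> float:
    """Return the ratio of two sample variances, computed from standard deviations."""
    return (sd_larger / sd_smaller) ** 2


# Standard deviations of the proportion of negative indicators, as reported in the text.
fraud_f = variance_ratio(0.220, 0.121)      # non-users vs questionnaire users, fraud case
no_fraud_f = variance_ratio(0.183, 0.149)   # non-users vs questionnaire users, no-fraud case

print(f"fraud case F = {fraud_f:.2f}")
print(f"no-fraud case F = {no_fraud_f:.2f}")
```

The fraud-case ratio reproduces the reported F of 3.31; assessing significance would additionally require the F distribution with the appropriate degrees of freedom, which is omitted here.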


Tests of the effectiveness hypotheses 


H, and He concerned the impact of red flags 
questionnaire use on fraud risk assessment. As
can be seen in Table 3, the use of a red flags questionnaire
had no significant impact on fraud risk
assessment for the fraud case. The difference in
the assessments of fraud risk for the no-fraud
client case between questionnaire users (34.26
on a 0-100 scale) and non-users (38.03) was not
significant at the 0.10 level. Both questionnaire
users and non-users appropriately assessed the
risk of fraud to be higher for the fraud case than
for the no-fraud case. However, the non-questionnaire
users outperformed the questionnaire
users for the fraud case, assessing the fraud risk
to be significantly higher than did the questionnaire
users (47.56 for the non-users, 36.21 for
the questionnaire users; t = -1.85, p = 0.069,
two-tailed).


TABLE 3. Mean fraud risk assessments

                                   Red flags questionnaire users    Non-questionnaire users
                                   Fraud case     No-fraud case     Fraud case     No-fraud case
                                   (n=33)         (n=35)            (n=36)         (n=33)
Assessment of risk of material
fraud on a scale where
0 = no chance, 100 = certainty     36.21          34.26             47.56          38.03
(S.D.)                             (22.93)        (21.47)           (27.56)        (19.26)

Note: For the no-fraud case, the difference in means between checklist users and
non-users is not significant (two-tailed t-test) at the 0.10 level. For the fraud case,
the difference in means between checklist users and non-users is significant (t =
-1.85, p = 0.069, two-tailed).
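
The t statistics for the risk assessments can be recomputed from the means, standard deviations and group sizes reported in Table 3. The sketch below assumes a pooled (equal-variance) two-sample t-test; the study does not state which variant was used, and the function name is illustrative. Under that assumption the fraud-case comparison reproduces the reported t of -1.85.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic with a pooled variance estimate, from summary statistics.

    Returns the t statistic (group a minus group b) and its degrees of freedom.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Questionnaire users vs non-users, values as reported in Table 3.
t_fraud, df_fraud = pooled_t(36.21, 22.93, 33, 47.56, 27.56, 36)
t_no_fraud, df_no_fraud = pooled_t(34.26, 21.47, 35, 38.03, 19.26, 33)

print(f"fraud case:    t = {t_fraud:.2f}, df = {df_fraud}")
print(f"no-fraud case: t = {t_no_fraud:.2f}, df = {df_no_fraud}")
```

The no-fraud statistic is well below 1 in absolute value, in line with the reported lack of significance at the 0.10 level.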
DISCUSSION


Summary and limitations 

This study provides the first available empiri- 
cal evidence on the impact of red flags question- 
naire use on data acquisition and fraud risk as- 
sessment by practicing auditors. Consistent with 
other research on structured and semi-struc- 
tured approaches to data acquisition, the results 
of this study indicate that the subjects who used 
a red flags questionnaire to aid them in fraud risk
assessment considered a more comprehensive 
and uniform set of potential fraud indicators 
than those subjects who did not use a question- 
naire. The increased comprehensiveness finding 
applied to both the fraud case and the no-fraud 
case examined. The increased uniformity find- 
ing was supported by two alternative tests 
(mean proportion of negative indicators, and 
variance of proportion of negative indicators) 
for the fraud case, but by only one of the tests 
(mean proportion of negative indicators) for the 
no-fraud case. The observed impact of question- 
naire use on comprehensiveness and uniformity 
of data acquisition did not lead to more effective 
fraud risk assessment. On the contrary, there was 
no significant difference in the assessed risk of 
fraud by questionnaire users and non-users for a 
no-fraud case, and the non-users outperformed
the questionnaire users for a fraud case. 

The use of a fraud case developed from a real 
audit client, rather than a hypothetical case, 
should increase the likelihood that the results of 
the experiment would generalize to an actual 
audit situation. However, the results should be 
interpreted with caution as, due to the time con- 
straints created by the use of a realistic audit 
case, only a single case evaluation per subject 
was obtained. Also, to allow comparisons be- 
tween the subjects, only one fraud case and one 
parallel no-fraud case were evaluated. The usual 
caveats about generalization to other subjects 
and other cases apply.


Implications 

The observed results are consistent with the 
data acquisition goals of more structured audit 
approaches. The red flags questionnaire users
considered both a more comprehensive set of
potential fraud indicators and a more consistent 
data set than did non-users. The questionnaire 
users documented consideration of six to seven 
times as many potential warning signals as non-
users, and showed significantly less variability in
their data sets. The attainment of these goals
alone could benefit the audit firm by making the
results of the audit more defensible. Attainment 
of these goals, however, did not yield the addi- 
tionally desired performance results. This is not 
problematical for the non-fraud case, where 
there was no significant performance difference 
between questionnaire users and non-users. But, 
the questionnaire users suffered a performance
decrement relative to non-users for the fraud 
case. This performance decrement would have 
costs which would partially or completely offset 
the data acquisition benefits. The magnitude of 
the difference in fraud risk assessments for the 
fraud case is quite large. The non-questionnaire 
users made risk assessments almost one-third 
higher than the questionnaire users’ assess- 
ments. A difference of this magnitude is likely to 
have practical as well as statistical significance. 
In particular, a difference of this magnitude 
would likely lead to differences in audit scoping 
decisions. If the lower fraud risk assessments of 
the questionnaire users ultimately led to a lower 
fraud detection rate, the costs could be quite 
high. 

The most interesting question raised by this 
study is: given that the red flags questionnaire 
has the desired impact on data acquisition, why 
does questionnaire use not have the desired im- 
pact on effectiveness? Prior research indicates 
that if a structured data collection approach is 
dysfunctional, it may be due to the failure to 
consider all relevant cues or to an excessive emphasis
on one set of cues over another. The remaining
discussion reports on an exploratory
examination of the data from this experiment.
The results of this examination may be helpful in 
providing possible explanations for the ob- 
served dysfunction in the fraud case, which in 
turn may be helpful in suggesting avenues for 
future research on fraud detection. The ultimate 
goal of such a line of research would be to either
create a revised red flags questionnaire which
would be functional for fraud risk assessment, or, 
alternatively, establish that a structured red flags 
questionnaire approach to fraud risk assessment 
is inherently ineffective. 


Potential sources of dysfunction and avenues 
for future research

The first source of dysfunction considered 
was the possible failure of questionnaire users to 
consider all relevant cues. For this experiment, 
an examination of the relevant information and 
impressions listed by the non-questionnaire 
users revealed several indicators which were 
not specifically contained on the questionnaire. 
This is particularly interesting given that the 
non-users considered a significantly smaller 
total number of indicators than the checklist 
users considered. Some non-questionnaire users 
noted that the client’s public company status 
was a relevant factor; some placed emphasis on 
their impressions of the competence and 
strength of the Board of Directors and/or the
Audit Committee; and some gave weight to their
impressions of the cash-management skills of the
client. None of these items were specifically included
on the red flags questionnaire, although
they may have been considered by question- 
naire users in making their overall evaluation, or 
in their assessment of some of the other items on 
the questionnaire. 

The fact that the non-users did consider some 
indicators not included on the red flags ques- 
tionnaire (prepared from the most often cited 
lists of red flags) indicates that there may be ad- 
ditional warning signals of potential fraud not 
now adequately considered in the auditing liter- 
ature. Continued research, using different sub- 
jects and different actual fraud cases, into the dif- 
ferences in the fraud indicators considered by 
questionnaire users and non-users may help isolate
previously undocumented potential fraud
indicators. These additional red flags might lead
to improved effectiveness of red flags questionnaires
and/or to improved predictive ability of
red flags in empirical studies of fraud versus no-fraud
firms.

The second potential source of dysfunction by
questionnaire users is an excessive emphasis on
one set of information cues over another. As was 
expected, non-questionnaire users focused 
more on negative indicators than neutral or posi- 
tive indicators. Questionnaire users considered 
a more balanced set of positive, neutral and 
negative indicators. Interestingly, there was a 
significant positive correlation between the 
proportion of negative indicators considered by 
a subject and the subject’s assessment of the risk 
of fraud at the client (all subjects: r = 0.4026, p 
= 0.000). This correlation pattern also held for 
questionnaire users only (r = 0.3076, p = 
0.005) and non-questionnaire users only (r = 
0.4807, p = 0.000). 

This result raises the possibility that the red 
flags questionnaire used in this study may have 
underemphasized negative indicators. One pos- 
sible source of such underemphasis could be the 
inclusion of too many low predictive ability red 
flags on the questionnaire, which in turn could 
overwhelm those red flags with high predictive 
ability. This seems particularly possible given 
the previously described results of the Albrecht 
& Romney (1986) study in which 30 of the red 
flags on their list of 87 items did not have signifi- 
cant predictive ability for a set of actual fraud 
and no-fraud cases. Continued empirical re- 
search (using larger databases of cases) on the 
predictive ability of red flags could yield more 
insight on red flags with significant predictive 
ability. Research comparing the impact on fraud 
risk assessment of exhaustive red flags question- 
naires and shorter questionnaires limited to 
those red flags with strong predictive ability 
could then be conducted. 

The results of this study indicate that the use 
of a red flags questionnaire influenced data ac- 
quisition in the expected manner, but did not 
improve fraud risk assessment. Possible avenues 
for additional research exploration include: (1) 
studying the differences in fraud indicators con- 
sidered by checklist users and non-users in order 
to isolate previously undocumented red flags; 
(2) studying the efficacy differences between 
exhaustive red flags questionnaires and shorter 
questionnaires confined to those red flags 
shown to have strong predictive ability for previous
actual cases. In the current auditing environment,
there are increasing pressures to expand
the responsibility of the auditor to detect fraudu- 
lent financial reporting. Recent practitioner lit- 
erature and the recently proposed revisions to 
current auditing standards suggest an increased 
emphasis on the red flags approach to fraud de- 
tection. Questions about the efficacy of a red 
flags questionnaire for assessing the risk of fraud 
at ordinary audit clients have obvious practical 
importance.

Further, red flags questionnaires may be viewed
as an exemplar of the creation of audit tools
or technology to add structure to the audit process.
As this study demonstrates, structure can
have both functional and dysfunctional aspects. 
Public accounting firms have moved in recent
years toward increasing structure in the audit
process, but there has been little theoretical or
empirical research to date concerning the opti- 
mal amount of structuring of audit tasks. The 
study of the efficacy of red flags questionnaires 
may yield broader theoretical implications con- 
cerning the audit judgment process and the relative
advantages and disadvantages of structuring
the audit process.
COGNITIVE HEURISTICS AND BIASES IN BEHAVIORAL AUDITING:
REVIEW, COMMENTS AND OBSERVATIONS* 


JAMES SHANTEAU 
Kansas State University 


Abstract 


The purposes of this paper are: (1) to provide background on the heuristics and biases approach to 
decision-making; (2) to describe some of the issues being debated in psychology concerning this approach; 
(3) to review relevant studies from the behavioral auditing research literature; (4) to provide comments on
trends in the auditing literature; (5) to offer comments about the advantages and disadvantages of this
approach; and (6) to make some observations about future prospects for this research tradition. 


In the early 1970s Tversky & Kahneman de- 
scribed a research orientation which has domi- 
nated the judgment and decision-making litera- 
ture ever since. They argued that humans make 
use of cognitive heuristics which reduce the 
complexity of making probabilistic judgments. 
“In general, these heuristics are quite useful, but 
sometimes they lead to severe and systematic er- 
rors” (Tversky & Kahneman, 1974, p. 1124). As 
evidence for the use of heuristics, numerous de- 
monstrations were developed in which subjects’ 
behavior deviated from normative standards 
(e.g. Bayes’ theorem). Such errors or biases were
reported for both naive students and expert subjects
(Tversky & Kahneman, 1971). In recent
years, an extensive and often-cited literature has 
developed around heuristics and biases. 

This approach has reached a level of popular- 
ity rarely seen in psychology. There have been 
numerous accounts of the implications of the re- 
search in the press. Under the headline, “Two 
eminent psychologists disclose the mental pitfalls
in which rational people find themselves
when they try to arrive at logical conclusions,”
the writer McKean (1985, p. 23) states:


“Kahneman and Tversky’s research has resulted in a 
theory that provides a systematic explanation for some of 
the most puzzling aspects of human behavior, and 
spearheaded the growth of a new discipline of science 
devoted to the behavioral aspects of decision making... 
Kahneman and Tversky’s work has begun to attract the at- 
tention of a wider audience — doctors, lawyers, 
businessmen, and politicians, who see applications for it 
in choosing therapies, devising legal arguments and cor- 
porate strategies, even conducting foreign affairs.” 


Similarly the headline from an article by Curran 
(1987) reads, “Recent psychological studies 
suggest that irrational fears cause bad buy-and- 
sell decisions. Knowing why can help you outwit 
the crowd.” The piece continues, “professional
money managers, whose own decisions are 
made in a high-stakes environment, have begun 
to pay attention to the researchers’ findings” (p. 
63). 

Given the prominence of this approach, it is
not surprising to find that behavioral auditing investigators
have become interested in cognitive
heuristics and biases. Before turning to a consideration
of relevant auditing research, some background
will be provided on the heuristics and
biases approach and the on-going debate in
psychology about its usefulness. The paper will
continue with comments about the current state
of affairs and conclude with observations about
the future of this approach.

*Many of the ideas in this paper were developed while the author was conducting research supported by Army Research
Institute contract MDA 903-8-C-0029. Additional concepts were developed when the author was a Visiting Professor at the
Johnson Graduate School of Management at Cornell University. A special gratitude is owed to various auditing researchers
who helped to educate me about the field, especially Jack Krogstad, William Wright, Robert Ashton, Ted Mock and Paul
Harrison.


BACKGROUND 


Kahneman & Tversky described their view of 
heuristics and biases as follows: “In making pre- 
dictions and judgments under uncertainty, 
people do not appear to follow the calculus of 
chance or the statistical theory of prediction. Instead,
they rely on a limited number of heuristics
which sometimes yield reasonable judgments 
and sometimes lead to severe and systematic er- 
rors” (Kahneman & Tversky, 1973, p. 237). They 
then defined three cognitive heuristics for risk 
judgments: representativeness, availability and 
anchoring-and-adjustment. 


Representativeness 

Representativeness refers to making an uncer- 
tainty judgment on the basis of “the degree to 
which it is: (i) similar in essential properties to 
its parent population; and (ii) reflects the salient 
features of the process by which it is generated” 
(Kahneman & Tversky, 1972, p. 431). Support- 
ing evidence has come from reports that people 
ignore base rates, neglect sample size, overlook 
regression toward the mean and misestimate 
conjunctive probabilities (Kahneman & 
Tversky, 1973; Tversky & Kahneman, 1983). 
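
Base-rate neglect is easier to see with a worked application of Bayes' theorem, the normative standard referred to above. The sketch below uses purely hypothetical numbers (a 1% base rate of fraud, an indicator that appears in 80% of fraud cases and in 10% of no-fraud cases); none of these values comes from the studies cited, and the function name is illustrative.

```python
def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Bayes' theorem: probability of the condition given that the indicator is present."""
    true_positive = base_rate * hit_rate
    false_positive = (1.0 - base_rate) * false_alarm_rate
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs, chosen only to illustrate the effect of a low base rate.
p_fraud = posterior(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.10)

print(f"P(fraud | indicator present) = {p_fraud:.3f}")
```

With these assumed inputs the normative posterior is only about 7%, far below the 80% that a judgment based on the similarity of the evidence to a typical fraud case might suggest.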


Availability
Availability is used to estimate “frequency or
probability by the ease with which instances or
associations could be brought to mind” (Tversky
& Kahneman, 1973, p. 208). In contrast to representativeness,
which involves assessments of
similarity or connotative distance, availability
reflects assessments of associative distance
(Tversky & Kahneman, 1973). Availability has 
been reported to be influenced by imaginability, 
familiarity and vividness, and has been sup- 
ported by evidence of stereotypic and scenario 
thinking (Kahneman & Tversky, 1979a). 


Anchoring-and-adjustment 
Anchoring-and-adjustment involves “starting 
from an initial value that is adjusted to yield the 
final answer. The initial value, or starting point, 
may be suggested by the formulation of the
problem, or it may be the result of a partial computation.
In either case, adjustments are typically
insufficient” (Tversky & Kahneman, 1974,
p. 1128). Supporting evidence comes from 
biases in evaluation of conjunctive and disjunc- 
tive events, insufficient revision of probabilities 
relative to Bayes’ theorem (Tversky & Kahne- 
man, 1974) and framing (problem restatement) 
effects (Kahneman & Tversky, 1984). 

The purpose of this paper is to evaluate the 
role that the heuristics and biases approach has 
played in behavioral auditing research on proba- 
bility and risk judgments. Space does not allow 
consideration, however, of the other contribu- 
tions of Kahneman & Tversky, such as prospect 
theory (Kahneman & Tversky, 1979b), which 
involve utility or value judgments. 


PSYCHOLOGICAL CRITICISMS 


Given the popularity of this approach, it may 
be surprising for some to learn of the extent of 
criticism offered in the psychological literature 
of the heuristics and biases research. Many of the 
major investigators in the judgment and deci- 
sion-making field have written negatively about 
this work. For instance, consider the following 
partial list of critics: Abelson, Anderson, Beach, 
Cohen, Edwards, Einhorn, Hammond, Hogarth, 
Humphreys, Jungermann, Manis, March, Phillips, 
Christensen-Szalanski, Wallsten, Winkler and G. 
Wright. Even supporters of this approach (e.g. Fischhoff
et al., 1979) have occasionally offered
negative comments. 




The criticisms have generally taken one of five 
forms. First, it has been suggested that heuristic 
strategies may produce cost-effective decisions 
in certain contexts. Hogarth (1981, p. 197), for 
example, states “the more serious criticism is the 
failure to specify conditions under which people 
do or do not perform well.” He goes on to note
that “insufficient attention has been paid to the 
effects of feedback between organism and envi- 
ronment.” Therefore, Hogarth (also see 
Thorngate, 1980) argues that heuristics may be 
adaptive, especially in dynamic situations. 

The second type of criticism questions the 
generality of empirical results. Wallsten (1983) 
states that “the generalization that subjects 
judge according to representativeness... is 
clearly overstated.” Similarly, Fischhoff et al.
(1979, p. 339) conclude that “people know or 
can figure out somewhat more than what they 
have been given credit for in the past.” Thus, 
there may have been a tendency to overstate the 
generality of judgmental biases. 

The third criticism concerns the relevance of 
heuristics and biases in the real world. Edwards 
& von Winterfeldt (1986, p. 679) argue: “The
whole issue of how good human intuitive performance
is may be more or less irrelevant to the
broader question of human intellectual competence,
because if the problem is important and
the tools are available people will use them and 
thus get right answers.” They point out that ex- 
perimenters get the correct answers using the 
very tools denied subjects. 

The fourth type of criticism reflects a selec- 
tion bias of researchers to cite evidence of 
judgmental errors and to ignore research report- 
ing appropriate behavior. As stated by Christen- 
sen-Szalanski & Beach (1984, p. 75), “It is our 
hypothesis that the widely held belief in the 
hopelessness of human judgment and decision 
performance results less from evidence to that 
effect than from the fact that only evidence to 
that effect gets much attention.” 

The final category of criticism questions the
logic of the heuristics and biases approach. “Re- 
search on judgment and decision making has 
been driven too much by a concern for errors relative
to a normative standard the validity of
which one can doubt with good arguments... A
final lesson to be learned from the debate might
be that one should avoid the term rationality in
psychology at all” (Jungermann, 1983, p. 639-640).
Cohen (1981) arrives at a similar conclusion.

Based on such criticisms, Anderson (1987, p. 
6) offers the following comment: “It is now 
widely recognized that the three heuristics of 
representativeness, availability, and anchoring 
and adjustment... are seriously incorrect... It is 
also becoming clear that the study of these 
heuristics has been pretty much a sterile blind 
alley.” 

In apparent response to their critics, Kahne- 
man & Tversky (1982a, p. 494) have offered sev- 
eral defenses of their approach and its emphasis 
on judgmental errors: 


“There are three related reasons for the focus on systematic
errors and inferential biases in the study of reasoning.
First, they expose some of our intellectual limitations and
suggest ways to improve the quality of our thinking.
Second, errors and biases often reveal the psychological
processes that govern judgment and inference. Third,
mistakes and fallacies help the mapping of human intuitions
by indicating which principles of statistics or logic
are non-intuitive.”


In 1983 Tversky & Kahneman went on to state 
that “the focus on bias and illusion is a research 
strategy that exploits human error, although it 
neither assumes nor entails that people are per- 
ceptually or cognitively inept” (p. 313). 

Others have also offered support for this ap- 
proach: “this study provides strong evidence 
that previous laboratory research on decision 
heuristics and biases is applicable to the ‘real 
world’, information-rich, interactive estimation 
and decision contexts” (Northcraft & Neale, 
1987, p. 96). In addition, Taylor (1982) points 
out that the Kahneman & Tversky approach has 
had a great impact on social psychology. 

As these comments illustrate, there has been 
considerable disagreement between the suppor- 
ters and critics of the heuristics and biases re- 
search. Most outside the judgment and decision- 
making area of psychology, however, seem una- 
ware of the extent of this disagreement. 
BEHAVIORAL AUDITING STUDIES


By my rough count, there have been at least
40 studies of heuristics and biases in the auditing 
literature. Most of these have investigated rep- 
resentativeness and related risk phenomena, 
with fewer studies of availability or anchoring- 
and-adjustment. Interestingly, there are over 20 
papers which provide reviews or commentaries 
of this work. It is obvious that heuristics and 
biases have generated considerable interest in 
behavioral auditing. 

Rather than reviewing all these papers, three 
recent studies of heuristics and biases will be 
considered briefly as examples of the approach. 
Then more global comments will be offered 
about the research in general.

In a study of representativeness, Holt (1987) 
reexamined the conclusions of Joyce & Biddle 
(1981a) about auditors’ use of base-rate infor- 
mation in judgments of management fraud. Base 
rates, because they are unrepresentative, are fre- 
quently underweighted relative to case specific 
evidence (Kahneman & Tversky, 1973). Joyce & 
Biddle found that auditors underutilized base
rates compared to normative (Bayesian) stand- 
ards, but that auditors did better than student 
subjects. In a series of five experiments, Holt re- 
ported that it was the wording of the problems, 
rather than any innate or learned ability, that led 
to the Joyce & Biddle results. She interpreted 
these findings as evidence of a framing effect 
(see below). 

The availability heuristic was examined in a 
study of analytic review by Libby (1985). Professional
auditors were asked to generate hypotheses
which might account for material financial
statement errors. Auditors were also asked about 
perceived frequency and recency of their ex- 
perience with various types of errors. The 
results suggested that “perceived error frequen- 
cies play a major role in the accessibility of error 
hypotheses in analytical review” (p. 663). The 
data also revealed a positive relation between re- 
cency of experience and generation of hypoth- 
eses. The findings were inconsistent, however, 
concerning effects of memory structure on accessibility.


Joyce & Biddle (1981b) investigated the ef- 
fects of anchoring-and-adjustment on probabilis- 
tic inferences in auditing judgment. They con- 
ducted six experiments to determine the extent 
of effects on practicing auditors’ judgments. The 
authors conclude, “The results of these experi- 
ments indicate that auditors sometimes make 
judgments that are in violation of normative 
principles of decision making, but that these vio- 
lations cannot always be accounted for the an- 
choring and adjustment heuristic” (p. 141). An 
extension of this research by Wright & Anderson 
(1982) did find that anchoring effects were 
robust.

These studies typify the somewhat confusing 
state of affairs about the role of heuristics and 
biases in auditing judgment. As Ashton (1983, p. 
34) concludes, “the research on heuristics and 
biases in audit decision-making has been some- 
what limited and the results have been mixed.” 
In the remainder of this paper, I will offer several 
comments and observations which may be of 
some value in reducing the confusion. 


COMMENTS ON BEHAVIORAL AUDITING 
RESEARCH 


Based on reading through many of the papers 
on heuristics and biases in behavioral auditing, 
four comments seem appropriate. First, account- 
ing researchers have frequently had difficulty 
translating the Kahneman & Tversky demonstra- 
tions into an auditing framework. As Biddle & 
Joyce (1982, p. 187) commented, “our failure to 
observe behavior consistent with availability 
may be due, at least in part, to problems with our 
experiment.” These authors, in devising a task to 
test for anchoring-and-adjustment, had selected 
a problem that was unrelated to risk, probability 
or frequency judgments — the domains for 
which the heuristics were developed and 
evaluated. 

A related problem can be seen in the research 
by Libby (1985) intended to study availability 
(described above). The experiment in fact man- 
ipulated accessibility, not availability. These two 
concepts generally are differentiated by
psychologists. Libby apparently recognized the
distinction and is careful to use the term “acces- 
sibility” throughout the paper. Nonetheless, this 
illustrates how difficult it can be to investigate 
heuristics using auditing stimuli. 

Corresponding difficulties with the word 
problems used to test heuristics and biases have 
been reported by psychological researchers. 
Evans & Dusoir (1977, p. 130), for instance, 
argue that “the construction of their (Kahneman 
& Tversky) problems seems unnecessarily com- 
plex;” they go on to show how simplifying the 
wording can almost eliminate a representative- 
ness effect. Bar-Hillel (1979) reported a similar 
finding. This difficulty in devising appropriate 
task examples suggests that heuristics and biases 
may have limited applicability in real-world 
auditing contexts. 

Second, the results reported in many auditing 
studies of heuristics and biases are often close to 
normative (as seen for the Joyce & Biddle 
studies described above). Consider the follow- 
ing examples: Gibbins (1977) found that about 
40% of auditors’ responses were predicted by
representativeness, about half made the normative
response and the remainder were inconsistent
with either. Bamber (1982) found that auditor
managers were not only sensitive to the reliability
of information, they overcompensated.
Kinney and Uecker (1982) observed results 
contrary to anchoring-and-adjustment in two 
experiments. Biddle & Joyce (1982) failed to 
find the effects predicted by the availability 
heuristic. Similar results are reported by Abdalmohammadi
& Wright (1982), Shields et al.
(1987), Tomassini et al. (1982) and Waller & 
Felix (1987). 

Even for studies which report findings consis- 
tent with heuristics and biases (e.g. Uecker & 
Kinney, 1977), the effects are generally smaller 
than those found by Kahneman & Tversky. Bid- 
dle & Joyce (1979, p. 31) suggest that “this 
superior performance by auditors may be at- 
tributable, in part, to their acquisition of profes- 
sional skills in evaluating sample evidence and to 
their familiarity with the decision settings por- 
trayed in the experiments.” Similarly, Tomassini 
et al. (1982) note that auditors are trained to
evaluate sample evidence as part of their professional
responsibilities. Whatever the reason, if
the heuristics and biases research had started 
out using auditors as subjects (instead of intro- 
ductory psychology students), it is doubtful 
whether the small effects observed would have 
generated much interest. 

Third, despite these generally inconclusive 
results, many auditing researchers nevertheless 
have argued for the heuristic-and-bias approach. 
When Biddle & Joyce (1982) failed to support 
the anchoring-and-adjustment heuristic, they 
concluded that “some as yet unidentified heuris- 
tic was at work” (p. 189). The possibility that the 
heuristics approach may have been inapprop- 
riate apparently wasn’t considered. It’s rather 
curious when failure to support a hypothesis 
only strengthens the resolve to find support. 
Similar arguments against the Biddle & Joyce 
conclusions were offered by Holstrum (1982) 
and Lewis (1982). 

There seems to be a tendency in behavioral 
auditing research on heuristics to define success 
or failure of a study by whether biases are observed
or not. Normatively appropriate behavior
does not get the attention that normatively inappropriate
behavior gets. For example, Ashton
(1983, p. 35) concludes “despite the mixed 
nature of the overall results in the heuristics 
area, findings such as these suggest that auditors 
often have difficulty understanding the implica- 
tions of sample information.” As noted above, 
this same “bias” toward emphasizing poor deci- 
sion behavior has been reported in the psycho- 
logical literature (Christensen-Szalanski & 
Beach, 1984). 

Fourth, there has been a trend in recent audit- 
ing studies to cite framing effects to account for 
the inconsistent results on heuristics and biases. 
Framing was defined by Kahneman & Tversky 
(1984) as a cognitive perspective elicited by 
task characteristics. They present as an example 
the framing of gambles in terms of status quo or 
in terms of initial wealth. Empirical effects are 
demonstrated by showing that subjects respond 
differently to logically equivalent but restated 
versions of the same problems. Auditing judg- 
ments, by this view, may depend on problem 
characteristics which are irrelevant to the deci-
sion itself. This could account for why biases 
were observed in some studies and not others 
(see the Holt study discussed above). 

Framing differences, in fact, may be able to ac- 
count for some of the apparent inconsistencies 
between studies. There is, however, a tendency 
in auditing research to label any and all context 
effects as “framing” (Demski & Swieringa, 1981). 
Context effects have a long history in psychology,
going back to the Gestalt approach to perception
(e.g. Wertheimer, 1938). Context effects
were later quantified in Helson’s (1964)
Adaptation Level theory into three components:
(1) stimulus factors, (2) background factors, and
(3) personality factors. Framing appears to be
concerned primarily with stimulus restatements,
e.g. is the glass half full or half empty? However,
all three of Helson’s components are likely to 
have an important impact on auditing judgment 
and deserve separate consideration. 

In sum, it’s relevant to ask about the contribution
of the heuristics and biases approach in increasing
understanding of audit judgment. Mumpower
(1978) suggests three questions that behavioral
accounting studies might address: (1)
What task variables influence accounting judgment?
(2) What individual difference variables
are important? and (3) How do task and individual
difference variables interact? Added to
the list might be: (4) How can accounting judgment
be improved? The heuristics and biases research
has provided some answers to the first
question, but has not addressed the other questions.


TYPES OF BEHAVIORAL AUDITING RESEARCH 


The study of heuristics and biases appears to 
have been of limited relevance for behavioral au- 
diting research for several reasons. First, the 
results have failed to reveal any consistent effects
attributable to heuristics and biases.
Second, only a narrow range of auditing tasks
has been used in heuristics and biases research.
Third, it’s not clear that heuristics and biases are
connected to central issues in behavioral auditing.
The latter two comments will be elaborated
on in this section.

To a psychologist looking at the field, there appear
to have been three types of behavioral auditing
studies (see Shanteau (1987) for a more
complete discussion). These are as follows.


Replication studies 

The first type of project is a replication of a
previously conducted behavioral study, only 
with auditors rather than introductory students 
as subjects. The methods and procedures are 
borrowed in total. The major research question 
is: Will the original findings replicate with au- 
ditors as subjects? For the most part, behavioral 
auditing studies of heuristics and biases fall into 
this category. They offer little advance in 
methodology, analysis, theory, etc., over the 
original Kahneman & Tversky studies. 

One positive feature of replication studies is 
that they have introduced many auditing investigators
to behavioral research. On the negative
side, however, replication studies are limited in
two important ways. First, they investigate issues 
which originate with non-auditors and may be of 
questionable relevance to auditing. Second, re- 
plication studies ask auditing subjects to answer 
questions which may have little relationship to 
their professional skills and knowledge. 


Adaptation studies 

The second type of auditing study looks at a 
research problem originating from accounting 
and/or auditing concepts, but using methods 
adapted from behavioral research approaches. 
One example involves analysis of sunk cost ef- 
fects (e.g. Thaler, 1980; Arkes & Blumer, 1985). 
The topic is of direct concern in accounting and 
auditing, but the methods and analyses reflect 
procedures used in heuristics and biases re- 
search. 

Adaptation studies are obviously an advance 
over replication research, since the research
problems originate from accounting and/or au- 
diting. However, behavioral methods may be in- 
sufficient to investigate many complex auditing 
issues, e.g. the effects of a new auditing policy. 
Instead, it may be necessary to combine behavioral
and non-behavioral methods in novel
ways to investigate such issues. 


Problem-driven studies 

The third type of project involves research designed
uniquely around the concerns of be-
havioral auditing. Such studies lead to their own 
methods and procedures; in contrast, the first 
two types of studies are largely spin-offs from be- 
havioral research. Thus, the methods and proce- 
dures flow from important auditing problems, 
not the other way around. 

This type of research marks the dividing line 
as far as a non-auditor is concerned — as a 
psychologist, I am no longer qualified to com- 
ment on specific projects. I firmly believe, how- 
ever, that this is the direction in which be- 
havioral research in auditing should head. 

In summary, behavioral auditing research on 
heuristics and biases falls primarily into the re- 
plication category; such research can be viewed 
as a transition stage. Adaptation studies may 
apply some of the methods from heuristics and 
biases research to accounting and/or auditing 
problems; this is clearly an improvement over 
replication research. Finally, problem-driven 
studies represent the future of behavioral audit- 
ing research; it’s not clear, however, that heuris- 
tics and biases will play any role in that future. 


WHERE'S THE THEORY? 


A number of judgment and decision-making 
researchers have criticized the heuristics and 
biases research for the absence of theoretical underpinnings
(e.g. Slovic et al., 1977; Jungermann,
1983; Anderson, 1987). As Wallsten (1983, p. 
13) observes: 


“We have now reached the point where it is necessary to 
develop theories of problem representation and of judg- 
ment... The research on heuristics should rely less on in- 
dividual word problems, and more on the systematic 
manipulation of features in a manner determined by the 
theory under consideration.” 


It is troubling that after more than 15 years of re- 
search on heuristics and biases, there is still no 
general theory or even specific models of under-
lying processes. 

This same concern applies to behavioral audit- 
ing studies — there doesn’t appear to have been 
much progress on theory development related 
to heuristics and biases. Presently, there are 
many borrowed concepts in behavioral auditing 
research, but little in the way of original 
theories. Although psychology and other be- 
havioral sciences can provide methodologies for 
answering questions about auditing, they cannot 
identify which auditing problems are theoreti- 
cally important to pursue. 

Although the absence of theory has been com- 
mented on by auditing researchers (Swieringa & 
Weick, 1982; Gibbins, 1984), there is a tendency 
to look to psychology for the answer. For in- 
stance, Biddle & Joyce (1982, p. 190) conclude 
that efforts to improve audit decision-making 
“are likely to be impeded until the psychological 
theory of decision making is better formulated.” 
This may be true, but audit researchers should 
also be looking to develop their own theories. 

The lack of theoretical progress is troubling, 
not only at the scientific level, but also at a prac- 
tical level. Practitioners, no less than basic re- 
searchers, want answers to such theoretically- 
based questions as how can judgments be made 
with greater accuracy and what can be done 
about systematic errors? It will be up to audit re- 
searchers to develop interesting theories and 
models. 

What is the theoretical status of cognitive illu- 
sions, heuristics and biases in behavioral ac- 
counting and auditing research? Let me address 
each of these concepts in turn. 


COGNITIVE ILLUSIONS


Frequently, an analogy is made between perceptual
illusions and the biases resulting from
the use of heuristics. In both cases, “errors and
biases often reveal the psychological processes
and the heuristic procedures that govern judgment
and inference” (Kahneman & Tversky,
1982b, p. 124). In parallel to sensory-based
perceptual illusions, judgment errors are often
labeled “cognitive illusions” (Tversky & Kahneman,
1983).

There is, however, a serious problem with this 
analogy. In the case of a perceptual illusion, the 
subject is directly exposed to but misperceives a 
stimulus object. The size of the illusion can be 
measured by comparing the subject’s response 
with the actual stimulus value. With cognitive il- 
lusions, on the other hand, the subject is never 
actually exposed to the correct value. Instead, 
the correct answer is derived from normative 
considerations, such as Bayes’ theorem. Since 
subjects are asked to make judgments about 
things they have not actually experienced (most 
of us don’t have a Bayesian calculator in our 
head), it shouldn’t be surprising that responses
turn out to be inaccurate. But unlike the study of
perceptual illusions, such inaccuracies have not
been shown to have any necessary connection
to psychological mechanisms. Therefore, it
seems somewhat tenuous to offer psychological
interpretations of cognitive illusions when the
basis of these “illusions” has yet to be established.
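
As a rough illustration of the kind of normative calculation at issue, the sketch below applies Bayes' theorem to a base-rate problem. The auditing scenario and all three probabilities are hypothetical values chosen only for this example, not figures from any study cited here, and the function name `posterior` is likewise an assumption.

```python
def posterior(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Return P(H | E) by Bayes' theorem for a binary hypothesis H.

    prior            -- P(H), the base rate
    hit_rate         -- P(E | H)
    false_alarm_rate -- P(E | not H)
    """
    # Total probability of observing the evidence E.
    evidence = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    return hit_rate * prior / evidence


# Hypothetical numbers, for illustration only:
# 5% of accounts contain a material error (base rate); a test flags
# 80% of erroneous accounts and 10% of error-free accounts.
p = posterior(prior=0.05, hit_rate=0.80, false_alarm_rate=0.10)
print(round(p, 3))  # 0.296
```

With these made-up inputs the normative answer is about 0.30, well below the 0.80 hit rate that a subject attending only to the case-specific evidence might report. The "correct" value is the output of a computation; it is not something the subject has observed.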

This argument was first offered by Shanteau 
(1978) who concluded that it would be more 
convincing if subjects who had experienced the 
relevant events were used. For such subjects, 
there would then be some basis for comparing 
their judgments to a perceived standard. 

This suggestion was later investigated by
Christensen-Szalanski et al. (1983). They found
that experienced physicians were substantially
less influenced by availability than college students
in making mortality estimates. Results
from Lichtenstein et al. (1978, experiment 2)
also showed that greater experience led to decreased
errors. The extent of the “illusion” thus
depends on experience; those who have experience
with the stimuli don’t show the illusion and
those who don’t have experience do show it.
This pattern of results suggests that the observed
errors arise from ignorance, not an illusion-like
process of misperception.

A different argument can be made from the au- 
diting study of Biddle & Joyce (1979). When 
they used tasks familiar to auditors, they found 
less evidence of a representativeness bias. They
reported “contingent processing of information; 
viz., assess sampling error using a representative- 
ness heuristic based on the sampling fraction un- 
less the sample sizes are significantly different” 
(p. 17). This suggests a hierarchical decision 
strategy under which the use or non-use of the 
heuristic is under the auditor’s control — hardly 
the description of an “illusion.” (Consideration
of contingent processing strategies has been receiving
increasing emphasis in the psychological
literature: Payne, 1982; Beach et al., 1986;
Tversky et al., 1988.)

In short, the use of perceptual illusions as an
analogy for the “cognitive illusions” of heuristics
and biases appears unjustified from both logical 
and empirical perspectives. Subjects may be 
making errors, but that doesn’t mean that the er- 
rors are the result of an “illusion.” 


BIASES 


There is little doubt that judgment biases can 
be demonstrated in the judgments of under- 
graduate psychology students. Although there 
has been considerable debate about the size of 
some of these biases (Carroll & Siegler, 1977; 
Manis et al., 1980; Beyth-Marom & Arkes, 1983; 
Christensen-Szalanski & Beach, 1983; Wright, 
1984), replicability is not the central issue in my 
view. The original examples of Kahneman & 
Tversky are easily replicated in classroom set- 
tings and in fact provide nice teaching material. 

A major question, however, is whether the 
biases observed with naive subjects also apply to 
experts. According to Tversky & Kahneman 
(1974, p. 1130), “The reliance on heuristics and
the prevalence of biases are not restricted to 
laymen. Experienced researchers are also prone 
to the same biases — when they think intui- 
tively.” They go on to say, “Although the statisti- 
cally sophisticated avoid elementary errors, 
such as the gambler’s fallacy, their intuitive judg- 
ments are liable to similar fallacies in more intri- 
cate and less transparent problems.” 

Most of the research evaluating biases in experts
has been conducted within medicine
and auditing. In medical decision-making, there
have been numerous studies of whether doctors’ 
judgments are biased. In their recent book, 
Schwartz & Griffin (1986) cite over 20 relevant 
papers. In the majority of these studies, the
biases were smaller than those observed for
naive subjects, or nonexistent (e.g.
Wallsten, 1981). Schwartz & Griffin conclude
that it is not clear which factors determine when 
biases will appear in expert medical judgment. 

In behavioral auditing, there have been re- 
ports of both biased and non-biased behavior. As 
noted previously, it is difficult to see many con- 
sistencies in the pattern of results. One trend 
does emerge, however. Generally, auditors are 
less biased in their judgments than naive subjects.
For instance, Shields et al. (1987, p. 384)
conclude that their results are “generally consistent
with previous research indicating that auditors’
judgments are less prone to biases than
most subjects in psychological experiments.”
Ashton (1982, 1983) arrived at similar conclu- 
sions. 

The primary evidence of biases in experts 
comes from Tversky & Kahneman’s (1971) survey
of psychologists at two meetings. The results
revealed a “prevalence of the belief in the Law of
Small Numbers... Apparently, acquaintance with 
formal logic and with probability theory does 
not extinguish erroneous intuitions” (p. 109). 
They go on to explain this result in terms of rep- 
resentativeness. This research has been widely 
cited as showing that experts are biased in their 
professional judgments (Slovic, ae Slovic et 
al., 1985).

A former student of mine (Bien 1972) attempted
to replicate this study using professional
statisticians as subjects. He found that
statisticians were less biased than the
psychologists in the original study. More importantly,
several of the statisticians disagreed with
the “correct answers” given by Tversky &
Kahneman. There was even disagreement
among the statisticians about the appropriate
answers. Apparently, the problems used in the
Law of Small Numbers paper were ambiguous
enough to allow for multiple interpretations and
hence multiple solutions. If so, that is hardly a
convincing basis for concluding that experts are
biased in the same ways as naive subjects.


There is a growing debate on the question of 
whether appropriate normative standards have 
been used to define biases. The definition of base 
rate, for instance, depends on the population 
from which the sample is drawn; a given sample
might have come from many populations
(Cohen, 1981). Since experts are more likely to
be aware of these alternative populations, it
should not be surprising that they might disagree
with the designated correct answer. But if
the definition of the normative standard is uncertain,
then the identification of a “bias” is
equally uncertain. Thus, it is not clear that experts
exhibit the sorts of biases so easily demonstrated
with naive subjects.


HEURISTICS


The concept of heuristics was introduced by 
Simon (1957) in his discussion of “limited rationality.”
He argued that, because of cognitive
limitations, humans have little option but to construct
simplified models of the world. Heuristics
are a product of these simplified models and provide
shortcuts that can produce decisions efficiently
and effectively. Simon saw heuristics as
adaptive strategies used by humans to cope with
their limited information processing capacity. As
an example, Simon identified satisficing (select
the first available option that meets minimal
standards) as a strategy commonly used in complex
decision situations.
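
The satisficing rule described above can be sketched in a few lines and set beside an exhaustive "pick the best" rule. This is only a minimal illustration: the option names, scores, aspiration level and the function names `satisfice` and `optimize` are hypothetical, introduced here for the example, and are not drawn from Simon's work.

```python
from typing import Iterable, Optional, Tuple

# (hypothetical option name, hypothetical quality score)
Option = Tuple[str, float]


def satisfice(options: Iterable[Option], aspiration: float) -> Optional[Option]:
    """Return the first option whose score meets the aspiration level."""
    for option in options:
        if option[1] >= aspiration:
            return option  # stop searching once an option is good enough
    return None  # no option met the minimal standard


def optimize(options: Iterable[Option]) -> Option:
    """Exhaustive benchmark: examine every option and return the best one."""
    return max(options, key=lambda option: option[1])


candidates = [("A", 0.55), ("B", 0.72), ("C", 0.90)]
first_good_enough = satisfice(candidates, aspiration=0.70)
best = optimize(candidates)
print(first_good_enough)  # ('B', 0.72)
print(best)               # ('C', 0.9)
```

In this toy case the satisficing rule stops at the second option and never inspects the third. That saves search effort at the cost of a lower score, which is the sense in which such a strategy is adaptive under limited processing capacity.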

As described by Howell & Dipboye (1986, pp.
390–391), “In the 1970's, Simon’s original argument
was rediscovered and given additional impetus
through a series of studies by Kahneman
and Tversky. What they showed was, in essence,
that behavior often does not even approximate
normatively optimal rules. Rather, people seem
to rely on handy rules of thumb or ‘heuristics’.”
Therefore, the presence of biases was used to
infer the existence of heuristics.

By connecting heuristics to biases, Kahneman
& Tversky took a different approach than Simon.
Several authors have commented on the uncertainty
of this connection: “Reification of biases is
logically strange. If the normative model is cognitively
invalid, deviations from that model cannot
have cognitive significance” (Anderson,
1987, p. 1). Anderson continues:


“The study of heuristics, as observed by Kahneman and 
Tversky (1982), was characterized by the study of errors.
But errors of judgment, being definable only relative to 
some normative standard of correctness, cannot provide 
an adequate basis for cognitive theory. The study of 
heuristics, for the same reason, cannot provide an 
adequate base for cognitive theory” (p. 17). 


A similar argument was offered by Jungermann 
(1983). 

The problem is that when a bias (error) exists, 
it is difficult to establish a logical connection to 
any particular heuristic. That is because many 
heuristics may lead to the same bias. Consider 
the following “heuristic” explanations for ignor- 
ing base rates: 

(1) recency order effects may lead to under- 
weighting of earlier information (base rates) and 
overweighting of later information (case 
specific); 

(2) subjects may misunderstand the instruc- 
tions or be confused by the word problem and so 
rely on the easier-to-understand case-specific information;

(3) the base-rate information lacks salience 
for the subject and is ignored in favor of the 
more relevant case-specific information; and 

(4) memory factors may lead to forgetting or 
overlooking base rates and so leave only case 
specific material available. 

Only one of these explanations (identified 
below) corresponds to the representativeness 
heuristic used by Tversky & Kahneman to ac- 
count for base-rate effects. But what is wrong 
with the others? The problem is that heuristics 
are offered post hoc as an explanation of biases
(Schwartz & Griffin, 1986). Representativeness
may, or may not, provide the best account of the
observed effect. Without further evidence, there
is no way to know.

Although the concept of judgmental heuristics
seems compelling, the connection between
representativeness (or other heuristics) and
specific errors has yet to be firmly established.
Therefore, the status of judgmental heuristics is 
problematic. 

(In the base-rate example above, explanation 
1 reflects order or serial-position effects, expla- 
nation 2 represents instruction or context ef- 
fects, explanation 3 corresponds to representa- 
tiveness and explanation 4 is based on memory 
or availability factors. Potentially, each could be 
used to explain why subjects might ignore base
rates.)


FUTURE DIRECTIONS 


Although making predictions about the future 
is presumptuous, if not foolish, I believe that 
judgment and decision research generally and
auditing research specifically need to move in
several directions.


(1) There is a need for more and better ex- 
periments on decision processes, as opposed to 
demonstrations of heuristics (Wallsten, 1983). 
Ideally, these experiments will be theory driven 
and based on quantitative concepts and models. 

(2) The goal should be to understand, not just 
describe, judgment and decision processes 
(Jungermann, 1983). Heuristics often provide
interesting behavior descriptions, but so far have
shed little light on the underlying
psychological processes.

(3) This increased understanding should be 
applied to improving decision-making through 
training or decision aids. Without an adequate 
understanding of underlying decision processes, 
efforts orientated towards debiasing will con- 
tinue to be unrewarding (Wright, 1984). 

(4) Unless more convincing evidence is of- 
fered, there does not appear to be much future 
for the heuristics and biases approach in be- 
havioral auditing research. Instead, the emphasis 
should be on research which addresses the un- 
ique concerns of accountants and auditors. 


FINAL COMMENTS 


Let me conclude with several comments. It is 
important to acknowledge the many contributions
of Kahneman & Tversky. They have stimu-
lated tremendous interest in the field of judg- 
ment and decision-making. Because of their 
work, many more people now know about the 
area than ever before. This has brought a number 
of new investigators into the field, both in 
psychology and in accounting and auditing. 
Although this paper might be seen as an attack 
on their work, that was not the goal. Rather, the 
purpose was to identify the limitations of the 
heuristics and biases approach and to look ahead 
to a new era. In the future, Kahneman &
Tversky's research is likely to be viewed as an
important transition from the narrow concerns
of the past to the broader perspectives of the future.

Finally, I look forward to the day when we can 
be as enthusiastic about good decision behavior 
as we have been about poor behavior. Be- 
havioral auditing is replete with examples of 
positive performance. I believe the future of 
judgment and decision research lies in under- 
standing the sources of such exemplary be- 
havior. Auditing researchers may well lead the 
way in that effort.


Abstract


This paper has two purposes: (1) to provide a conceptual basis for empirical research on auditors’ causal 
judgments by reviewing the philosophical and psychological literatures on causality, and (2) to report two 
experiments on auditors’ information processing when making causal judgments. The experiments 
examined the association between the direction of inference (forward inference from cause to effect vs 
backward inference from effect to cause) and type of information about a causal relation (conditionality vs 
multiplicity). The results supported the hypotheses that auditors place a significant weight on conditionality
information when making forward inferences and a significant weight on multiplicity information when
making backward inferences.


Many of the tasks that comprise the audit opin- 
ion formulation process require an auditor to 
explain observed events or predict outcomes 
given observed or assumed conditions. Judged 
causal relations play important roles in such exp- 
lanations and predictions (Libby, 1981). For 
example, when preliminary analytical proce- 
dures reveal a large difference between an 
account’s book and expected audit values, a 
judgment about the cause of the difference is re- 
quired before an appropriate decision about 
audit effort for the account can be made (Kin- 
ney, 1979; Libby, 1985). In statistical sampling 
applications, the practice of “isolating” errors re- 
quires a judgment about their cause 
(Burgstahler & Jiambalvo, 1986). Audit planning 
requires predictions of how client-environmental
factors might cause problems in the conduct
of the audit (Gibbins & Wolf, 1982; Waller &
Felix, 1984a). A prediction about a client’s continued
existence also may depend on judged
causal relations (Kida, 1984).





This paper has two purposes. The main pur- 
pose is to report two experiments on auditors’ 
information processing when making causal 
judgments. Regarding the relation between a 
cause, X, and an effect, Y, an important task vari- 
able is whether a forward inference from X to Y 
or backward inference from Y to X is being 
made (Toda, 1977; Bindra et al., 1980; Burns & 
Pearl, 1981; Bjorkman & Nilsson, 1982). An im-
portant informational variable pertains to the 
distinction between the multiplicity and condi- 
tionality of causality (Einhorn & Hogarth, 1981, 
1986). Multiplicity refers to possible alternative 
causes of Y other than X. Conditionality refers to 
the conditions that combine with X to bring 
about Y. Following Einhorn & Hogarth (1981,
1986), it was hypothesized that: (1) auditors
place a significant weight on conditionality information
when making forward causal inferences,
and (2) auditors place a significant weight
on multiplicity information when making backward
causal inferences. These hypotheses were
tested in two experiments which used practicing
auditors as subjects.

The other purpose is to provide a review of 
the philosophical and psychological literatures 
on ordinary or non-scientific notions of causal- 
ity. A review of the psychological literature is 
needed as a conceptual basis for empirical 
inquiry of auditors’ causal judgments, given the 
limited though increasing amount of related au- 
diting research (Kida, 1984; Libby, 1985; Ander- 
son & Wright, 1986; Ho & May, 1987). A review
of the philosophical literature is also necessary 
because, somewhat surprisingly, the psycholog- 
ical research is highly dependent on philosophi- 
cal analyses.

The remainder of the paper is organized as follows.
The second section presents the literature
review. The third section develops and states the
two hypotheses. The fourth and fifth sections report
the experiments. The last section contains
some concluding remarks.


LITERATURE REVIEW


The literature review is organized in terms of
the summary statements shown in Table 1.


TABLE 1. Literature review summary statements

Causality in philosophy

1. The concept of causality is a central notion in ordinary thought and language.
2. Cause and effect are distinguishable events.
3. Causality is indicated, but not infallibly, by the cues of constant conjunction,
   temporal priority and spatiotemporal contiguity.
4. Effects at molar levels may have multiple causes.
5. A cause and attendant conditions are distinguishable.
6. Causal judgments depend on the task, context and individual.
7. The primitive notion of causality involves an action that makes something
   happen.
8. Causality and probability are compatible concepts.

Causality in psychology

1. People rely on causal knowledge for purposes of comprehension, explanation,
   prediction and control.
2. A propensity for causal inference is manifest early in life.
3. Causal judgments are cue-based judgments under uncertainty.
4. Causal judgments depend on the task, context and individual.
5. Over-reliance on causal knowledge may lead to biases.


Causality in philosophy


The concept of causality is a central notion
in ordinary thought and language. Philosophers
since Aristotle have recognized the importance
of the concept of causality in ordinary
life, and accordingly have produced many
analyses that attempt to explicate the concept
(for anthologies see Beauchamp, 1974; Sosa,
1975; Brand, 1976).¹ Hume (1740) contended
that “all reasonings concerning matter of fact
are founded on the relation of cause and effect”
(p. 186) and that causality is “to us the cement of
the universe” (p. 198). Mill (1843, p. 213) referred
to the notion of cause as “the root of the
whole theory of Induction.” Blanshard (1962, p.
444) described causality as “the lifeline connecting
us with the world.” Many other philosophers
have stressed that the concept of causality is indispensable
in applied domains such as
medicine (Collingwood, 1940), jurisprudence
(Hart & Honore, 1959) and engineering (Bunge,
1982).

¹ Aristotle identified four senses of cause: (1) an efficient cause is an agent that produces a change in something; (2) a final
cause is the purpose which a change in something is intended to accomplish; (3) a material cause is something that is
changed; and (4) a formal cause is that into which something is changed (Taylor, 1967). For example, when producing a
good, a manufacturer (efficient cause) changes raw material (material cause) into a finished good (formal cause) for the
purpose of making a profit (final cause).

Not all philosophers agree with this position. 
Russell (1912–13, p. 1) characterized causality
as “a relic of a bygone age, surviving, like the 
monarchy, only because it is erroneously sup- 
posed to do no harm” and called for “its com- 
plete extrusion from the philosophical vocabul- 
ary.” A less extreme and more representative 
view is that causality is important in ordinary life 
but inappropriate in science: “As scientific 
modes of investigation develop, the language of 
cause tends to its own supersession. ... Scientific 
insight is the death of causal conceptions” 
(Black, 1958, p. 29). Indeed, due to various influ-
ences including quantum theory with its prob-
abilistic view and logical positivism with its anti-
metaphysical aims, causality in science virtually
disappeared in the first half of this century. How-
ever, since about 1960, there has been a re-
surgence of interest in causality in various phys-
ical and social sciences, including even physics
(Bunge, 1979, 1982).²


Cause and effect are distinguishable events. 
Unlike a functional relation which permits bi- 
directional inferences, a causal relation is asym- 
metric. Causes bring about effects in such a way 
that effects cannot be said to bring about causes. 
Causality may be “reversible” between classes of 
events (e.g. the auditee’s financial problem last 
year caused last year’s qualified opinion which in 
turn caused the auditee’s financial problem this 
year, and so on), but not between a given cause 
and effect (Cook & Campbell, 1979). 

While this statement may seem innocuous, it 
has been a major stumbling block in attempts to 
define causality. Consider the “essentialist” de-
finition of causality: X caused Y means that X was
necessary and sufficient for Y. Necessity means
that Y cannot occur when X does not, and it per- 
mits a certain inference from Y to X. Sufficiency 
means that Y must occur when X occurs, and it 
permits a certain inference from X to Y. Certain 
inferences in either direction are permitted only 
when X was necessary and sufficient for Y. But, 
if X (Y) was necessary for Y (X), then Y (X) was 
sufficient for X (Y). Thus, if X was necessary and 
sufficient for Y, then Y was necessary and suffi- 
cient for X, and the definition fails to distinguish 
cause and effect (Taylor, 1963). 
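
The symmetry argument can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes that a "world" is simply a truth assignment to X and Y, and the function names (`necessary`, `sufficient`, `essentialist_cause`) are hypothetical labels chosen for illustration. It enumerates every possible set of observed worlds and confirms that X satisfies the essentialist definition for Y exactly when Y satisfies it for X.

```python
from itertools import product

def necessary(cause, effect, worlds):
    """cause is necessary for effect: effect never occurs without cause."""
    return all(w[cause] for w in worlds if w[effect])

def sufficient(cause, effect, worlds):
    """cause is sufficient for effect: effect always occurs when cause does."""
    return all(w[effect] for w in worlds if w[cause])

def essentialist_cause(cause, effect, worlds):
    """Essentialist definition: necessary and sufficient."""
    return necessary(cause, effect, worlds) and sufficient(cause, effect, worlds)

# The four possible joint states of X and Y.
ALL_STATES = [{"X": x, "Y": y} for x, y in product([True, False], repeat=2)]

def all_world_sets():
    """Yield every subset of the four joint states (16 subsets in total)."""
    for mask in product([True, False], repeat=len(ALL_STATES)):
        yield [s for s, keep in zip(ALL_STATES, mask) if keep]

# Necessity of X for Y is the same test as sufficiency of Y for X ...
necessity_mirrors_sufficiency = all(
    necessary("X", "Y", ws) == sufficient("Y", "X", ws) for ws in all_world_sets()
)

# ... so the full definition cannot tell cause from effect.
symmetric_everywhere = all(
    essentialist_cause("X", "Y", ws) == essentialist_cause("Y", "X", ws)
    for ws in all_world_sets()
)

print(necessity_mirrors_sufficiency, symmetric_everywhere)
```

Because the definition is symmetric in every case, some additional asymmetric ingredient (such as temporal priority or manipulation) is needed to separate cause from effect.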


Causality is indicated, but not infallibly, by 
the cues of constant conjunction, temporal 
priority, and spatiotemporal contiguity. 
Philosophers before Hume defined causal rela- 
tions using ideas such as power, force, efficacy 
and necessary connection. Hume argued that 
there is no empirical, and thus no justifiable, 
basis for such ideas. Instead, Hume defined caus- 
ality in terms of three observable cues or 
criteria: (1) The cause and effect are contiguous 
in time and space. (2) The cause precedes the ef- 
fect. (3) There is a constant conjunction of cause 
and effect (i.e. similar causes always produce
similar effects). Although all three criteria are in- 
formative, none is infallible. 

Regarding contiguity, cause and effect may be 
separate in space, e.g. the Vietnam War caused 
student unrest on American campuses (Brand, 
1976). Similarly, cause and effect may be sepa- 
rate in time, e.g. mixing two colorless chemicals, 
potassium iodate and sulphurous acid, produces 
a mixture with a vivid color, but only after a 
lapse in time (Ducasse, 1951). To account for 
such cases, the criterion must be modified by 
presuming an implicit causal chain of contigu- 
ous events. Although such “micromediation” 
may always be implicit, it is rarely made explicit 
in practical contexts, and meaningful causal in-
ferences at molar levels do not require know-
ledge of it (Cook & Campbell, 1979). Further,
contiguity makes causality dependent on
specified boundaries. In the statement “Used as a
magnet, the lodestone caused the iron filings to
move” the relevant boundary is the lodestone’s
magnetic field, but in the statement “Used as a
paperweight, the lodestone caused the papers to
stay put” the relevant boundary is the lodes-
tone’s material surface. In this example, con-
tiguity is defined in terms of the causal relation
rather than vice versa (Ducasse, 1951).

² Scientific research without causal models suggests the predicament of a problem solver who is not permitted to formally rep-
resent a problem in a manner that is consistent with the way he or she thinks about it (cf. Blalock, 1964; Toda, 1977).

Regarding temporal priority, although few 
would argue that an effect can precede its cause 
(Dummett, 1954; Flew, 1954; Chisholm & 
Taylor, 1960), cause and effect may occur simul-
taneously, e.g. the motion of a writer’s hand may
cause the motion of a pen, though both move 
simultaneously. Indeed, it may be argued that a 
cause and its effect must be simultaneous: 


Consider, then, the case of a window breaking as a result 
of a stone being thrown against it. Here it is tempting to 
say that the stone is first thrown, and then the window 
breaks, implying that the cause occurs before the effect. 
But that is not a good description of what happens. It is 
not enough that the stone should be thrown; it must hit 
the window. Only then does the window break; cause 
and effect are simultaneous (Taylor, 1963, p. 311). 


To modify the criterion such that an effect can-
not precede its cause is inadequate, since this
modification would not always distinguish cause
and effect.

Regarding constant conjunction, some such 
instances are not causal, e.g. day followed by 
night and the correlation of pigs and pig iron. 
Further, causal relations in unique cases are 
problematic. A unique cause and effect are “con- 
stantly conjoined” not only with each other but 
also with every other past event that has occur- 
red only once. Finally, in the phrase “similar 
causes always produce similar effects,” what 
does similarity mean? It cannot mean exactly 
similar, so it must mean similar only in terms of 
certain features. But, in specifying those fea-
tures, it is difficult to avoid the circularity that
the events of a given class are similar in all caus- 
ally relevant features. 

In sum, although causality may be an illusory 
glue meant to bind events between which there 
is no real gap (Schlick, 1932), the concept in or- 
dinary thought and language seems to involve 
something more than that which Hume’s criteria 
provide. Whether causality is objective or a pro- 
duct of the mind is an issue that is still in dispute 
among philosophers (Putnam, 1984). 


Effects at molar levels may have multiple 
causes. In ordinary thought and language, it is 
natural to suppose that a given sort of effect can 
be brought about in different ways, i.e. the effect 
has a “plurality of causes” (Mill, 1843). For ex- 
ample, financial statement errors may be caused 
by misrecording transactions, computational 
errors, incorrect adjusting entries, and so on. In- 
deed, many causal inquiries involve sifting 
through and eventually identifying one from a 
set of alternative possible causes. Nevertheless, 
the plurality of causes doctrine is controversial. 
First, it may be an artifact of how effects are 
categorized at molar levels. The apparent plural- 
ity of causes may reduce to a single cause when 
the effect is more precisely described (Hart & 
Honore, 1959). However, the relevant level of 
categorization depends on the purpose at hand, 
and as long as molar causal laws serve that pur- 
pose, the effect may have multiple causes (Cook 
& Campbell, 1979). Second, it employs a cause- 
as-sufficient-condition definition, so that the 
role of necessity is put in question. The cause-as- 
necessary-and-sufficient-condition definition is 
applicable only if the cause is specified as the 
disjunction of multiple sufficient conditions. 
The cause-as-necessary-condition definition
must be modified such that the cause is neces-
sary in the circumstances. But, this modification
may be questioned when there is overdetermi-
nation, i.e. two or more sufficient causes are pre-
sent (Scriven, 1975).⁴ Despite the controversy,
it is clear that the plurality of causes frequently is
a practical concern in applied domains.

³ Cook & Campbell (1979, p. 32) distinguish molarity and micromediation as follows: “the term molar refers to causal laws
stated in terms of large and often complex objects. Micromediation refers to the specification of causal connections at a level
of smaller particles than make up the molar objects and on a finer time scale.”


A cause and attendant conditions are distin-
guishable. In many cases, what is sufficient for
an effect is not seen as a single condition but in- 
stead as a complex set of conditions. For ex- 
ample, an observed audit failure was brought 
about by the conjunction of an error being pre- 
sent, inadequate internal control, lack of detec- 
tion by the auditor, economic loss by the finan- 
cial statement user, and so on. In such cases, it 
might be argued that the conjunction of condi- 
tions, rather than just one conjunct, is the cause: 
“The real Cause is the whole of these ante- 
cedents; and we have, philosophically speaking, 
no right to give the name of cause to one of them 
exclusively of the others” (Mill, 1843, p. 214). 
Notwithstanding Mill, attempts to distinguish 
the cause from attendant conditions are com- 
mon in ordinary thought and language. 

Mackie (1965, p. 245) refers to the cause as an
INUS condition, i.e. “an insufficient but neces-
sary part of a condition which is itself unneces- 
sary but sufficient for the result.” More pre- 
cisely, let X = an INUS condition, C = other con- 
ditions such that the conjunction of C&X is a 
minimal sufficient condition for an effect Y, and 
M = the disjunction of possible minimal suffi- 
cient conditions for Y other than C&X. In these 
terms, the statement “X caused Y” means: (1) X 
was an INUS condition (assuming C was not 
null), (2) C, X and Y occurred, and (3) none of 
the disjuncts in M occurred (overdetermination
was not present). The concept of an INUS condi-
tion is useful because it incorporates both the 
plurality of causes doctrine and the distinction 
between cause and attendant conditions. How- 
ever, it raises the question as to how this distinc- 
tion is to be made. Most answers to this question 
emphasize the role of context, which is discus- 
sed below. 
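
The three-part reading of “X caused Y” can be sketched in code. This is an illustrative sketch only: it assumes each minimal sufficient condition can be written as a set of named conditions, and the condition names (for example `collusive_fraud`) and the functions `is_inus` and `caused` are hypothetical, loosely following the audit-failure example above.

```python
def is_inus(x, minimal_sufficient):
    """x is an INUS condition: part of a compound minimal sufficient
    condition (C is not null) that is itself not the only route to Y."""
    in_compound = any(x in cond and len(cond) > 1 for cond in minimal_sufficient)
    has_alternative = len(minimal_sufficient) > 1
    return in_compound and has_alternative

def caused(x, minimal_sufficient, present, effect_occurred):
    """'X caused Y': (1) X is an INUS condition, (2) C, X and Y occurred,
    (3) no other minimal sufficient condition (a disjunct of M) occurred."""
    if not (effect_occurred and is_inus(x, minimal_sufficient)):
        return False
    for cond in minimal_sufficient:
        if x in cond and len(cond) > 1 and cond <= present:
            rivals = [m for m in minimal_sufficient if m != cond]
            if not any(m <= present for m in rivals):
                return True
    return False

# Hypothetical minimal sufficient conditions for an audit failure.
CONDITIONS = [
    frozenset({"error_present", "weak_internal_control", "auditor_misses_error"}),
    frozenset({"collusive_fraud", "forged_confirmations"}),
]

observed = {"error_present", "weak_internal_control", "auditor_misses_error"}
single_route = caused("weak_internal_control", CONDITIONS, observed, True)

# Overdetermination: both sufficient conditions are present at once.
both_routes = observed | {"collusive_fraud", "forged_confirmations"}
overdetermined = caused("weak_internal_control", CONDITIONS, both_routes, True)

print(single_route, overdetermined)
```

The sketch deliberately says nothing about which conjunct of C&X should be singled out as the cause; as the text notes, that choice depends on context.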


Causal judgments depend on the task, con-
text, and individual. The task influences which
aspect of causality is relevant (Copi, 1978). 
When the task is to predict (explain) an effect, 
the primary focus is on sufficiency (necessity). 
When the task is to produce (prevent) an effect, 
the primary focus is on manipulable sufficient 
(necessary) conditions. In addition, the task in-
fluences the relevant level of molarity for causal
inquiries (Cook & Campbell, 1979). 

Many philosophers have emphasized the im- 
portance of context in judging causality. 
Ducasse (1951) asserted that a causal relation
has three terms, i.e. X caused Y in S, where S is
some set of circumstances. Anderson (1938)
contended that causal inquiries are not con-
cerned with the relation between X and Y in gen-
eral but with respect to some field, which is
“acted upon” and divided into two parts, one in
which the effect occurs and another in which it
does not.⁵ Mackie (1965) similarly noted that
causal statements such as “X caused Y” may be 
expanded into statements such as “X caused Y 
with respect to an assumed field F.” The field 
helps to distinguish a cause and attendant condi-
tions, since the former divides the field while (at
least a subset of) the latter are constant through-
out it. Also, a shift in the field may change the re-
lation between X and Y. More precisely, X
caused Y in F₁, X caused Y in F₂, and so on, are
different causal relations.

⁴ The following examples from Mackie (1974, p. 44) are illuminating:

(A) A man is shot dead by a firing squad, at least two bullets entering his heart at once, either of which would have been
immediately fatal.

(B) A man sets out on a trip across the desert. He has two enemies. One of them puts a deadly poison in his reserve can
of drinking water. The other (not knowing this) makes a hole in the bottom of the can. The poisoned water all leaks out
before the traveller needs to resort to this reserve can; the traveller dies of thirst.

Example A is a clear case of overdetermination in which it is impossible to answer the question as to which bullet caused
death. Example B is only an apparent case of overdetermination. When the effect is defined as death-by-thirst rather than
merely death, it is straightforward to construct a causal chain from can-puncturing, but not from poisoning, to the effect.

⁵ Anderson’s (1938) analysis corresponds loosely with that of Aristotle (see note 1). That is, an efficient cause (X) acts upon
the material cause (the field) to produce a formal cause (Y).

An individual’s role, purpose, or prior knowl- 
edge also may affect the assumed field. In cases 
where the effect to be explained is an abnormal 
or unexpected event, the cause must also be an
abnormal or unexpected event in the situation,
since the normal course of events consists of
merely noncausal conditions. Whether a given 
event is considered to be normal depends on 
both the individual and context (Hart & Honore, 
1959; Gorovitz, 1965). Hart & Honore (1959, p. 
34) provided the following example: 


So the wife of the man with the ulcerated stomach, who
looks upon the parsnips as the cause of his indigestion, in
asking what has given him indigestion, is in fact asking:
“What has given this man in his condition indigestion
when usually he gets by without it?” The doctor who
gives the man’s ulcerated stomach as the cause ap-
proaches the case with a wider outlook and a different set
of assumptions; ... His question (in contrast with the
wife’s) is: “What gave this man indigestion when other
men do not get it?”


The wife and doctor identify different causes be- 
cause they assume different fields. 


The primitive notion of causality involves
an action that makes something happen.
Numerous philosophers have defined causality 
in terms of human activity that produces or pre- 
vents something in nature (e.g. Collingwood, 
1940; Gasking, 1955; Black, 1958). Collingwood 
(1940, pp. 296-297) defined a cause as “an 
event or state of things which it is in our power 
to produce or prevent, and by producing or pre- 
venting which we can produce or prevent that 
whose cause it is said to be.” Causal knowledge is 
knowing how to bring about an effect, e.g. the 
cause of the financial statement error was a 
bookkeeper’s misrecording a significant transac- 
tion. Such knowledge provides humans a special 
survival value and basis for experimentation in 
modern science (Cook & Campbell, 1979). 
Gasking (1955) similarly referred to causes as 
“recipes” for producing results. Humans dis-
cover that manipulating certain objects in cer-
tain ways under certain conditions produces
certain results. Through such discoveries, they 
may develop a general manipulative technique 
for producing a class of events, X. If producing X 
also produces another class of events, Y, but not 
vice versa, then it may be said that X causes Y. 
Black (1958) defined causality in terms of “mak-
ing something happen” and provided the follow-
ing paradigm case:


You are thirsty, but there is a glass of beer within easy 
reach; you stretch out your hand, bring the glass to your 
lips, and drink. Here is what I call a perfectly clear case of 
making something happen (1958, p. 15). 


Such a paradigm case can be used as a baseline 
for judging the causal nature of less clear cases as 
well as a point of departure for derivative cases 
such as attributing powers to natural 
phenomena. 

This notion of causality easily distinguishes a 
cause and its effect, even when they occur simul- 
taneously, since a person produces the effect by 
producing the cause, but not vice versa. The 
cause also can be distinguished from attendant 
conditions, e.g. in the statement “Striking the 
match caused the flame” “striking the match” is 
clearly the cause even though other conditions 
(presence of oxygen, dry match, etc.) also were 
necessary. Another advantage is that a causal re- 
lation in a unique case may be judged by com- 
paring the case’s features with a paradigm case. 
However, the notion has limited applicability, 
since there are no general manipulative tech- 
niques for some cases of causality. In the state- 
ment “The melting of the polar ice cap was 
caused by the increased heat of the sun” both 
cause and effect are beyond the control of 
humans. Also, when two or more general tech-
niques are applicable in a given case, the notion
provides no basis for isolating that which is 
causal (Rosenberg, 1973). 


Causality and probability are compatible 
concepts. Causal judgments often involve uncer-
tainty, such that probabilistic views are relevant
regardless of one’s assumption about deter-
minism.⁶ In a deterministic world, uncertainty in
judging causality may arise from several sources.
First, an individual's causal knowledge may be 
incomplete. Using Mackie’s (1965) terms, an in- 
dividual may lack complete knowledge about all 
minimal sufficient conditions and the INUS con- 
ditions that comprise them. Indeed, when molar 
causal laws are adequate for the purpose at hand, 
a lack of knowledge about micromediation is to 
be expected (Cook & Campbell, 1979). Second, 
in a given situation, an individual may lack com- 
plete information on the conditions that are pre- 
sent. Third, there may be overdetermination in 
the situation, creating uncertainty about which 
of two or more alternatives is the cause. Uncer- 
tainty from such sources would be the basis for 
a probabilistic theory of deterministic causality. 
But, a probabilistic theory of deterministic 
causality should be distinguished from a theory 
of probabilistic causality (Rosen, 1982-83). In 
light of evidence on radioactive decay indicating 
that there are fundamentally probabilistic 
phenomena, some philosophers have dropped 
the assumption of determinism in their analyses 
of causality (Suppes, 1970; Cartwright, 1979; 
Salmon, 1980, 1984; Rosen, 1980, 1982-83). 
Suppose that $t$, $t'$, and $t''$ are points in time, with
$t'$ preceding $t$ and $t''$ preceding $t'$. $X_{t'}$ is the cause
of $Y_t$ if and only if the following conditions hold:

$$P(X_{t'}) > 0,$$
$$P(Y_t \mid X_{t'}) > P(Y_t), \text{ and}$$
$$P(Y_t \mid X_{t'} Z_{t''}) > P(Y_t \mid Z_{t''}),$$

where $Z_{t''}$ includes all events prior to the occurrence
of $X_{t'}$. That is, a cause must increase the
probability that its effect will occur, and this in- 
crease must not be attributable to a third factor 
which preceded the cause. At present, con-
troversy surrounds the issues of whether the as-
sumption of determinism should be abandoned
and how to explicate probabilistic causality
(Hesslow, 1976, 1981; Rosen, 1978; Salmon,
1980; Suppes, 1984).
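
The three probabilistic conditions can be illustrated with a small numerical sketch. It rests on simplifying assumptions that are not in the source: the prior events Z are collapsed into a single binary background factor, the third condition is checked for each value of that factor, and the probabilities are invented for illustration. The helper names (`build_joint`, `is_probabilistic_cause`) are hypothetical.

```python
from itertools import product

def build_joint(p_z, p_x_given_z, p_y_given_zx):
    """Joint distribution over (z, x, y) for binary Z (earliest), X, then Y."""
    joint = {}
    for z, x, y in product([True, False], repeat=3):
        pz = p_z if z else 1 - p_z
        px = p_x_given_z[z] if x else 1 - p_x_given_z[z]
        py = p_y_given_zx[(z, x)] if y else 1 - p_y_given_zx[(z, x)]
        joint[(z, x, y)] = pz * px * py
    return joint

def prob(joint, event):
    """Probability of the outcomes for which event(z, x, y) is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def cond(joint, event, given):
    """Conditional probability P(event | given)."""
    both = prob(joint, lambda z, x, y: event(z, x, y) and given(z, x, y))
    return both / prob(joint, given)

def is_probabilistic_cause(joint, tol=1e-12):
    """Check the three conditions for X as a cause of Y."""
    y_event = lambda z, x, y: y
    if prob(joint, lambda z, x, y: x) <= 0:                    # P(X) > 0
        return False
    if cond(joint, y_event, lambda z, x, y: x) <= prob(joint, y_event) + tol:
        return False                                           # P(Y|X) > P(Y)
    for z0 in (True, False):                                   # P(Y|XZ) > P(Y|Z)
        if prob(joint, lambda z, x, y: x and z == z0) <= 0:
            continue
        with_x = cond(joint, y_event, lambda z, x, y: x and z == z0)
        without_x = cond(joint, y_event, lambda z, x, y: z == z0)
        if with_x <= without_x + tol:
            return False
    return True

# Spurious case: Z drives both X and Y; X adds nothing once Z is known.
spurious = build_joint(0.5, {True: 0.8, False: 0.2},
                       {(True, True): 0.9, (True, False): 0.9,
                        (False, True): 0.1, (False, False): 0.1})

# Genuine case: X raises the probability of Y at every level of Z.
genuine = build_joint(0.5, {True: 0.8, False: 0.2},
                      {(True, True): 0.9, (True, False): 0.5,
                       (False, True): 0.4, (False, False): 0.1})

print(is_probabilistic_cause(spurious), is_probabilistic_cause(genuine))
```

In the spurious case X raises the probability of Y unconditionally, yet fails the third condition, which is the sense in which the increase is "attributable to a third factor which preceded the cause."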

Causality in psychology

People rely on causal knowledge for pur-
poses of comprehension, explanation, predic-
tion, and control. People have knowledge struc-
tures, or schemata, that are used in comprehend- 
ing stimuli, explaining past events, predicting 
future events, collecting information, and con- 
trolling outcomes (Neisser, 1976; Rumelhart, 
1980; Abelson, 1981). A subset of these 
schemata represent knowledge about causal re- 
lations (Tolman & Brunswik, 1935; Heider, 
1958; Piaget, 1960; Kelley, 1972). Causality in 
human thought and behavior has been examined 
in many areas of psychology including develop- 
mental psychology (Piaget, 1960; Michotte, 
1963), social psychology (Kelley & Michela, 
1980; Hastie, 1983), reasoning (Bindra et al., 
1980; Schustack & Sternberg, 1981), and be- 
havioral decision making (Tversky & Kahneman, 
1980, 1983; Einhorn & Hogarth, 1981, 1986). 
The remainder of this section presents a select- 
ive view of the psychological literature, with a 
heavy emphasis on the work of Einhorn & 
Hogarth (1981, 1986). The reader will note the 
high degree of dependence of the psychological 
research on the philosophical literature re-
viewed above.


A propensity for causal inference is manifest
early in life. Piaget (1960) proposed that the
child develops causal notions along with notions
of external reality. The child gradually proceeds
from an initial state in which there is no self/envi-
ronment distinction to a state in which what
comes from the self and what forms external
reality are mostly, though not entirely (even in
adulthood), separate. Fragments of internal ex-
perience or adherences remain, e.g. believing in
magic and endowing objects with consciousness
or internal power. Developing causal knowledge 
consists in the progressive elimination of adher- 
ences and objectification of causal relations. 
Subsequent developmental research has 
examined the information processing underly-
ing causal judgments at different ages, especially
the use of temporal order, constant conjunction,
and contiguity cues (Sedlak & Kurtz, 1981). The
temporal order of events is a powerful cue in
causal inference, even for three-year-old chil-
dren (Bullock & Gelman, 1979). Very young
children also rely on constant conjunction, but
less consistently than older children do (Shultz
& Mendelson, 1975). Gaps in temporal or spatial
contiguity dramatically decrease children’s abil-
ity to use the constant conjunction cue (Siegler,
1975). Thus, children use more or less reliably
the Humean cues, even at very young ages. But,
they also rely on cues of questionable validity.
For example, similarity (in terms of sound, color,
etc.) of cause and effect has some influence on
causal inference, at least for very young children
(Siegler & Ravinsky, 1977). In contrast with the
view that perceptions of causality require
gradual development or prolonged experience,
Leslie and Keeble (1987) reported that even 27-
week-old infants can detect causal relations
using a low level perceptual mechanism.

⁶ Under a deterministic view, all effects have causes. Thus, if Y occurs given X in S but does not occur given a similar X′ in S′,
then there must be a relevant difference between X and X′ or between S and S′ (Anscombe, 1975).


Causal judgments are cue-based judgments
under uncertainty. Tolman & Brunswik (1935)
stated that psychology is concerned with an 
organism’s response to two features of the envi- 
ronment: (1) its causal texture in which events 
depend on each other and (2) the equivocality 
of this dependence. The organism uses available, 
probabilistic information to make inferences 
about the causal texture and relies on such infer- 
ences to achieve its goals. Also employing a 
Brunswikian approach, Einhorn & Hogarth 
(1986) proposed that causal judgments depend 
on the use of a bundle of cues (i.e. covariation, 
temporal order, spatiotemporal contiguity and 
similarity), where each cue is only a fallible indi- 
cator of causality. Due to cue fallibility, causal 
judgments are judgments under uncertainty. 

Most of the cues-to-causality in the Einhorn &
Hogarth (1986) framework have been discussed
above; however, the covariation cue warrants 
further attention. The Humean criterion of con- 
stant conjunction and essentialist definition of 
causality require that, whenever X is present, so 
is Y, and whenever X is absent, so is Y. In other 
words, observations must appear only in cells a
and d of a 2X2 table (Table 2). But from a Bruns-
wikian perspective, causation does not imply 
perfect covariation (Einhorn & Hogarth, 1986). 
As noted earlier, a cause in ordinary thought and
language is not adequately defined as a necessary
and sufficient condition. Rather, a cause is a con-
dition that conjoins with other conditions to
bring about Y (cf. Mackie, 1965). When the
other conditions are absent, Y does not occur
even though X does; cell b in Table 2 reflects the
conditionality of causality. As cell b increases,
the judged causal relation between X and Y may
weaken but does not necessarily disappear,
holding other cues-to-causality constant.
Analogously, when X is absent but some other
sufficient condition is present, Y occurs even
though X does not; cell c in Table 2 reflects the
multiplicity of causality. Again, the judged
causal relation may weaken but does not neces-
sarily disappear.

TABLE 2. 2X2 table for cause and effect

                     Y
               Present   Absent
X   Present       a         b
    Absent        c         d
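
The effect of cells b and c on covariation can be made concrete with a short sketch. The index used here, the difference between P(Y | X) and P(Y | not X), is a common covariation measure chosen for illustration; it is an assumption of this sketch and is not presented as the weighting scheme of Einhorn & Hogarth (1986). The cell counts are invented.

```python
def conditional_rates(a, b, c, d):
    """Return P(Y present | X present) and P(Y present | X absent)
    from the cell counts of the 2X2 table."""
    if a + b == 0 or c + d == 0:
        raise ValueError("each row of the table needs at least one observation")
    return a / (a + b), c / (c + d)

def delta_p(a, b, c, d):
    """Covariation index: P(Y | X) minus P(Y | not X)."""
    p_y_given_x, p_y_given_not_x = conditional_rates(a, b, c, d)
    return p_y_given_x - p_y_given_not_x

# Perfect covariation: observations fall only in cells a and d.
perfect = delta_p(20, 0, 0, 20)

# Conditionality: X sometimes occurs without Y (cell b grows).
with_cell_b = delta_p(20, 10, 0, 20)

# Conditionality plus multiplicity: Y also occurs without X (cell c grows).
with_cells_b_and_c = delta_p(20, 10, 5, 20)

print(perfect, round(with_cell_b, 3), round(with_cells_b_and_c, 3))
```

Under these invented counts the index falls as cells b and c fill but stays above zero, which parallels the claim that the judged relation may weaken without disappearing.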

Besides cue fallibility, other sources of uncer- 
tainty in causal judgments include an indi- 
vidual’s awareness of his or her lack of knowl- 
edge about all multiple sufficient conditions, 
INUS conditions that comprise them, and in- 
teraction effects when more than one causal 
schema may be applicable. Such uncertainty is a 
basis for subjective probability judgments. As 
observed by Einhorn & Hogarth (1981, p. 31), 
“uncertainty arises from knowing that you don’t 
and probability attempts to quantify this.” Piaget 
& Inhelder (1975, pp. xvii-xviii) similarly noted 
from a developmental perspective that “the idea 
of chance and intuition of probability constitute 
almost without a doubt secondary and derived 
realities, dependent precisely on the search for 
order and its causes.” The role of causality in 
probability judgments is discussed further 
below. 




Causal judgments depend on the task, con- 
text and individual. Reliance on the temporal 
priority cue when judging the causal relation be- 
tween X and Y does not imply that the judged 
causal relation is useful only for inferences from 
X to Y. On the contrary, causal schemata facili- 
tate both forward and backward inferences. 
Further, schematic development depends on at- 
tempts to associate observed effects with prior 
events as well as attempts to predict or produce 
outcomes. The interdependence of forward and 
backward inferences is well-captured by Kier- 
kegaard’s statement: “Life can only be under- 
stood backwards; but it must be lived forwards” 
(quoted in Einhorn & Hogarth, 1982, p. 2). 

The direction of inference affects task diffi- 
culty. Tversky & Kahneman (1980) proposed 
that forward inferences are more natural and 
easier than backward inferences and reported 
data supporting the hypothesis that individuals 
are more confident when making forward infer- 
ences. Bjorkman & Nilsson (1982) reported that 
subjects learned to perform a forward task more 
easily than a corresponding backward inference 
task. Bindra et al. (1981) found that (children) 
subjects’ forward inferences were more accu-
rate than their backward inferences. However,
Burns & Pearl (1981) found no such difference.⁷
In related research, Fischhoff (1975) attributed 
subjects’ problems with prediction to the low 
quality of their explanations. The overall conclu- 
sion would seem to be that forward inferences 
are less difficult than backward inferences. How- 
ever, before generalizing this conclusion, it is 
important to consider other factors, e.g. the con- 
ditionality and multiplicity of causality: 

(W)e can explain but not predict, whenever we have a
proposition of the form “The only cause of Y is X” [1] ...
Notice that this is perfectly compatible with the state-
ment that X is often not followed by Y. ... Hence, when
X is observed, we can predict that Y is more likely to
occur than without X, but still extremely unlikely. So we
must, on the evidence, still predict that it will not occur.
But if it does, we can appeal to [1] to provide and guaran-
tee our explanation (Scriven, 1959, p. 480).

When there are few or no alternative causes and
many conditions must conjoin with X to pro-
duce Y, backward inferences may be much less
difficult than forward inferences (cf. Einhorn &
Hogarth, 1981).

The direction of inference also may affect the 
information processing that underlies causal 
judgments (Einhorn & Hogarth, 1981, 1986). As 
stated earlier, an individual’s degree of uncer- 
tainty about the causal relation between X and Y 
in a given situation is in part due to incomplete 
causal knowledge, and thus it may be affected by 
processing situation-specific information re- 
garding conditionality or multiplicity. The 
nature of the information processing effect de- 
pends on whether the task involves a backward 
or forward inference. When a backward infer- 
ence is made, the focus is on whether Y would 
have occurred if X had not (cf. Mackie, 1974),
such that multiplicity information is more rele- 
vant (Lipe, 1985). When a forward inference is 
made, the focus is on whether X will produce Y, 
such that conditionality information is more 
relevant (Schustack & Sternberg, 1981). The as-
sociation between the direction of inference and
auditors’ information processing is discussed in
the next section. 

As discussed earlier, contextual/individual 
variables affect causal judgments through the as- 
sumed causal field. A dramatic effect of role on 
causal judgments, sometimes referred to as the 
“fundamental attribution error,” has been 
documented in the social psychology literature 
(Nisbett & Ross, 1980): an observer tends to at- 
tribute an actor’s behavior to internal, disposi- 
tional causes (e.g. personality) while an actor 
tends to attribute his or her own behavior to ex- 
ternal, situational causes (e.g. social pressure). 
Another contextual/individual effect concerns
the multiplicity of causality. Multiplicity de- 
pends on both an individual’s prior causal 
knowledge and the extent to which the situation 
rules out alternative causes. As the number and/
or strength of alternative causes increase, the
“net” strength of the causal relation between the
effect and any given alternative cause is dis-
counted (Einhorn & Hogarth, 1986; Kelley,
1973). For example, when judging the cause of
company A’s financial problems, learning that its
competitors have similar problems would not
rule out an industry-economic cause, such that
the causal relation between company A’s finan-
cial problems and the competence of its internal
management would be discounted. Finally, in-
formation processing when judging causality
may depend on whether the context is general
or unique (Einhorn & Hogarth, 1981). Covaria-
tion data may be available for use when judging
a causal relation in general, whereas causal
schemata must be relied on when judging a
causal relation in a unique situation (Kelley,
1972).

⁷ Both Tversky & Kahneman (1980) and Burns & Pearl (1981) regrettably refer to forward (backward) inferences as causal
(diagnostic) judgments. The position taken here is that both forward and backward inferences may involve causality.


Over-reliance on causal knowledge may 
lead to biases. People sometimes apply their 
causal schemata in cases where non-causal, 
probabilistic models are more appropriate⁸
(Brehmer, 1980; Nisbett & Ross, 1980). Tversky 
& Kahneman (1980) reported that subjects 
making conditional probability judgments over- 
stated the informativeness of the conditioning 
event when it was perceived to be a cause of the 
target event. Tversky & Kahneman (1983) con- 
cluded that subjects’ reliance on causal 
schemata led to violations of probability theory’s
conjunctive rule. On the other hand, Bar Hillel 
(1980) suggested that the tendency to ignore 
base rates is mitigated when they are perceived 
to be specific, causally relevant information. 
Finally, Fischhoff (1975) found a “hindsight 
bias” whereby processing outcome information 
led to a reduction in the outcome’s perceived 
surprisingness. Past outcomes may be readily 
explainable by an individual’s existing causal 
schemata such that the need to learn from ex- 
perience is erroneously perceived to be low. 


HYPOTHESES 


Recent auditing research includes several 
models that attempt to describe the cognition 
underlying auditors’ judgments (Gibbins, 1984;
Waller & Felix, 1984a, 1984b). In conjunction 
with the philosophical and psychological re- 
search reviewed above, such models provide a 
basis for examining auditors’ information pro- 
cessing when making causal judgments. It is as- 
sumed that the storage and use of an auditor’s 
knowledge about the world involve cognitive 
structures, or schemata (Waller & Felix, 1984a, 
b). For a class of phenomena, a schema specifies 
conditions that are common to all class members,
variables that may assume different values
for different class members, and relations between
the conditions and/or variables. In long-
term memory, a schema’s variables have default 
values that represent central tendencies. When 
activated by cues from the audit task or context, 
a schema may be used in at least three ways. 
First, default values may be retrieved from a 
schema. Second, a schema may guide a search for 
new information in the environment, and cur- 
rently observed values may replace default val- 
ues. Third, inferences about currently unob- 
served values for some variables may be made 
based on currently observed values for other 
variables and known relations. In these terms, 
the cognition underlying a particular audit judg- 
ment involves the activation of a relevant 
schema and its instantiation, i.e. assignment or
re-assignment of values to a schema’s variables. 
Auditors’ causal judgments may be similarly 
described. When performing many explanation 
and prediction tasks, an auditor activates and in- 
stantiates a relevant causal schema. Following 
Mackie (1965) and Einhorn & Hogarth (1981, 
1986), it is assumed that an auditor's causal 
schema for a class of phenomena (e.g. errors in 
Accounts Receivable) includes: (1) a variable or 
subschema representing the effect, (2) variables 
or subschemata representing multiple sufficient 
conditions for the effect, each of which may de- 
compose into an array of INUS conditions, (3) 
fixed conditions which are always present for 
the phenomena represented by the schema, and 
(4) relations between these variables and condi- 
tions. Further, with respect to causal relations in- 
volving either unique events or classes of events, 
it is assumed that: (1) an auditor's causal 
schemata are incomplete in that not all multiple
sufficient conditions or INUS conditions are rep- 
resented precisely and (2) he or she is aware of 
this incompleteness. This incompleteness is a 
source of uncertainty in his or her causal judg- 
ments. 
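
The schema structure assumed above can be made concrete with a small sketch. The class, field names and example conditions below are hypothetical illustrations, not part of the study, and the Boolean logic is a deliberate simplification: the text assumes that auditors' schemata are incomplete and that their judgments are uncertain, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass
class CausalSchema:
    """Illustrative causal schema: an effect, alternative sufficient
    conditions (each a conjunction of INUS conditions), fixed
    conditions, and default values for the schema's variables."""
    effect: str
    # Each frozenset is one sufficient condition; its members are INUS
    # conditions that must all be present for that route to the effect.
    sufficient_conditions: list[frozenset[str]]
    fixed_conditions: frozenset[str] = frozenset()
    # Default values stand in for central tendencies in long-term memory.
    defaults: dict[str, bool] = field(default_factory=dict)

    def instantiate(self, observed: dict[str, bool]) -> dict[str, bool]:
        """Currently observed values replace default values."""
        return {**self.defaults, **observed}

    def effect_expected(self, observed: dict[str, bool]) -> bool:
        """Infer the effect when any sufficient condition is fully met."""
        values = self.instantiate(observed)
        return any(
            all(values.get(cond, False) for cond in conjunction)
            for conjunction in self.sufficient_conditions
        )


# Hypothetical schema for errors in Accounts Receivable.
schema = CausalSchema(
    effect="material error in Accounts Receivable",
    sufficient_conditions=[
        frozenset({"incompetent_bookkeeper", "no_internal_verification"}),
        frozenset({"understaffed_shipping_office"}),
    ],
    defaults={
        "incompetent_bookkeeper": False,
        "no_internal_verification": False,
        "understaffed_shipping_office": False,
    },
)

alone = schema.effect_expected({"incompetent_bookkeeper": True})
conjoined = schema.effect_expected(
    {"incompetent_bookkeeper": True, "no_internal_verification": True}
)
alternative = schema.effect_expected({"understaffed_shipping_office": True})
```

In this sketch the target cause alone does not produce the effect; it does so only when it conjoins with its enabling condition, while the alternative cause produces the effect by a separate route.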

Many empirical questions regarding auditors’ 
causal judgments are suggested by the above 
review and discussion. For example, how do au- 
ditors acquire and represent their causal knowl- 
edge? How are auditors’ causal judgments af- 
fected by variables pertaining to the audit task 
and context? What is the effect of experience 
and prior knowledge on auditors’ causal judg- 
ments? How does auditors’ causal knowledge af- 
fect their subjective probability judgments? 
How do auditors process information when 
making causal judgments? 

The experiments reported below focused on 
the association between the direction of infer- 
ence and auditors’ processing of conditionality 
and multiplicity information. When the audit 
task demands a forward inference from X to Y, a
schema for the causal relation between X and Y 
is activated and partially instantiated with the 
observation or assumption that X is present. An 
auditor’s judgment about whether X will cause Y 
in this situation is expected to depend on infor- 
mation regarding the presence or absence of 
conditions that must conjoin with X to produce 
Y. That is, a significant weight will be placed on 
conditionality information.® 


H1: When making forward causal inferences, auditors
place a significant weight on information about the
conditionality of causality.

Alternatively, when the audit task demands a
backward inference from Y to X, the relevant
causal schema is partially instantiated with the
observation or assumption that Y is present. An
auditor’s judgment about whether Y was caused
by X in this situation is expected to depend on
information regarding: (1) the presence or absence
of X, and (2) the presence or absence of alternative
causes as well as the strength of their
relations with Y. That is, a significant weight will
be placed on multiplicity information.

H2: When making backward causal inferences, auditors
place a significant weight on information about the
multiplicity of causality.

These hypotheses were tested in experiments 1
and 2 reported below.


EXPERIMENT 1 


Subjects 
Subjects were 56 senior auditors from a Big 
Eight public accounting firm, having a mean (s)
of 36.9 (5.8) months of auditing experience and
having worked on a mean (s) of 23.6 (9.6)
audits. All subjects were in attendance at a pro- 
fessional training seminar. 


Design 

There were three between-subjects indepen- 
dent variables. One independent variable con- 
cerned the direction of inference. A forward 
(backward) inference about a target cause and
effect was made by 27 (29) subjects. The other
independent variables concerned multiplicity
and conditionality information. As to multiplicity,
27 (29) subjects were given information indicating
that there was a relatively strong
(weak) alternative cause. As to conditionality,
31 (25) subjects were given information indicating
that conditions enhancing (offsetting) the
target cause were present.


Procedure, setting and task 

The procedure, setting, and task for the for- 
ward inference group are described first, fol- 
lowed by a description of those for the backward 
inference group.


Forward inference group. Subjects were
given a questionnaire and two sealed envelopes
marked A and B. They were told that the study
concerned causal judgments in a simplified audit
setting where limited information is available.

®This is not to say that a prediction about the occurrence of Y will be unaffected by multiplicity information. On the contrary,
as the number of alternative causes known to be present increases, P(Y) will approach one. But, the focus in this study is on
inferences from X to Y and not on predictions of Y regardless of its cause.

The setting involved a continuing engage- 
ment with a hypothetical company, Thompson 
Distributors, Inc. Thompson was a relatively 
small, wholesale distributor of hardware pro- 
ducts. Its management and accounting person- 
nel were of average competence and integrity. 
The company had recently experienced good, 
though not exceptional, growth and profitability.
The focus was the year-end balance of gross
Accounts Receivable. Direct test procedures
had not yet been performed, and there was un- 
certainty about whether Accounts Receivable 
was correctly stated. A study and evaluation of 
internal control for the sales and collections 
cycle had been performed, and the following 
episode surfaced: 


The bookkeeper responsible for the sales journal quit her
job about two months before year end to resume her formal
education. She had carried out her responsibilities at
Thompson in a very conscientious and competent manner
for several years. Unfortunately, her replacement was
a disaster. It soon became apparent that she was
inadequately trained, careless, and very inefficient. By year
end, things were at the point where the controller had to
fire her.


Subjects were told that they would be pro- 
vided with additional information from the 
study and evaluation of internal control, but first 
they had to judge the strength of the causal relation
between the change in bookkeepers (target
cause) and error in Accounts Receivable (effect)
without this information. They were asked “How
likely is it that the change in bookkeepers would
cause a material error in Thompson’s Accounts
Receivable?” The response scale ranged from 0
to 100, in increments of ten, and had the following
anchors: “Certain that change in bookkeepers
would not cause error” (0) and “Certain that
change in bookkeepers would cause error”
(100). They also were asked “How much confidence
do you have in the judgment that you gave
for (the causal strength question)?” The response
scale ranged from zero to ten, in increments
of one, with the following anchors: “Not
any confidence at all” (0) and “Complete confidence”
(10).

There were two types of available information
about internal control: (1) information about
conditions that enhance or offset any risk of
error in Accounts Receivable due to the change
in bookkeepers, and (2) information about alternative
causes of a material error in Accounts Receivable,
i.e. causes not directly tied to the
change in bookkeepers. For expositional purposes,
the first type is referred to below as conditionality
information, and the second type is referred
to as multiplicity information. (These
terms were not used in the questionnaire.) The
order of listing the information types was varied
over subjects.

Subjects were asked which information type 
was more relevant when judging the causal rela- 
tion described above, and responded by allocat- 
ing 100 points between the types according to 
causal relevance. After these responses, subjects 
were told that the sealed envelopes contained 
conditionality and multiplicity information. 
Whether an envelope marked A or B contained
conditionality or multiplicity information was
varied over subjects. However, for ease of exposition,
the following description is written as if an
envelope marked A (B) always contained conditionality
(multiplicity) information. Subjects
were instructed to open the envelope marked A
(B) and read its contents if conditionality (multiplicity)
information was judged more relevant,
but not to open the other envelope yet. After
reading the contents, subjects again responded 
to the same causal strength and confidence ques- 
tions. Finally, they were instructed to open the 
other envelope and read its contents, after 
which they responded to the causal strength and 
confidence questions for a third time. 
The envelope marked A contained either of
two levels of conditionality information:

Enhancing conditions: there are numerous internal control
weaknesses in the accounting for sales. For example,
sales invoices are recorded in the order of shipment and
not necessarily in numerical order. Also, there is no internal
verification of the inclusion of all sales invoices in the
sales journal.

Offsetting conditions: there are numerous internal control
strengths and no major weaknesses in the accounting
for sales. For example, the controller accounts for all bills
of lading and traces them to sales invoices. He also verifies
the inclusion of all sales invoices in the sales journal.

The envelope marked B contained either of
two levels of multiplicity information:

Weak alternative cause: sales clerks use a standard price
list when preparing sales invoices. Several times during
the year, the price list was revised due to moderate increases
in product prices. On one occasion, there was
some delay in getting the updated list to the sales clerks.

Strong alternative cause: the shipping office is understaffed
and always slow in processing shipping documents.
By year end, there were numerous cases in which goods
had been shipped weeks earlier but still not billed to
customers.


Backward inference group. The procedure, 
setting, and task for the backward inference 
group were similar to those described above, 
with several important exceptions. Like the 
other group, subjects were told about the client, 
including the change in bookkeepers, and the 
focus on Accounts Receivable. However, they 
were told that, based on analytical and other test
procedures, there appeared to be a material
error in Accounts Receivable. They were asked
“Assuming there is a material error in
Thompson’s Accounts Receivable, how likely is
it that this error was caused by the change in
bookkeepers?” The response scale had the following
anchors: “Certain that error was not
caused by change in bookkeepers” (0) and “Certain
that error was caused by change in bookkeepers”
(100). They also were asked the confidence
rating question described earlier. Subjects
were told that the sealed envelopes contained
conditionality and multiplicity information.
They were asked which information type
was more relevant when judging the causal relation
described in this paragraph, and responded
by allocating 100 points between the types according
to causal relevance. Subjects then processed
the conditionality and multiplicity information
and provided second and third responses
to the causal strength and confidence questions.


Results 

Preliminary analyses. Table 3 presents de- 
scriptive statistics on the causal judgments, con- 
fidence ratings and points allocated to informa- 
tion types. Overall, the initial causal judgments 
were moderately high (70.6). The subsequent 
change (70.6 to 55.0) due to processing infor- 
mation of the type judged more relevant was sig- 
nificant (t = 4.53, p = 0.001). However, the 
change (55.0 to 55.4) due to processing infor- 
mation of the type judged less relevant was insig- 
nificant (t = 0.15, p = 0.88). Overall, the initial 
confidence ratings were moderately high (7.1), 
and increased significantly from the first to third 
judgments (t = 2.64, p = 0.01). Overall, conditionality
information was judged to be more relevant
than multiplicity information (t = 2.57, p
= 0.01).


TABLE 3. Descriptive statistics: experiment 1*

                        Forward           Backward
                        inference group   inference group
                        (n = 27)          (n = 29)          Overall
Causal judgments:
  First                 68.3 (20.5)       72.8 (16.0)       70.6 (18.3)
  Second                55.2 (30.1)       54.8 (23.2)       55.0 (26.5)
  Third                 59.6 (29.7)       51.4 (26.4)       55.4 (28.1)
Confidence ratings:
  First                  7.3 (1.9)         6.9 (2.8)         7.1 (2.4)
  Second                 7.9 (1.8)         6.3 (2.3)         7.1 (2.2)
  Third                  7.9 (1.8)         7.6 (2.0)         7.8 (1.9)
Points allocated to information types:
  Conditionality        72.2 (16.0)       45.4 (23.8)       58.4 (24.3)
  Multiplicity          27.8 (16.0)       54.6 (23.8)       41.6 (24.3)

*Means with S.D. in parentheses.
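
As a reading aid for Table 3, the "Overall" means for the causal judgments can be recovered, up to rounding, as sample-size-weighted averages of the two group means. The sketch below uses only the figures printed in the table; the function name is an illustrative choice.

```python
def pooled_mean(means, sizes):
    """Sample-size-weighted average of group means."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


sizes = (27, 29)  # forward and backward inference groups

# Group means for the first, second and third causal judgments (Table 3).
first = pooled_mean((68.3, 72.8), sizes)
second = pooled_mean((55.2, 54.8), sizes)
third = pooled_mean((59.6, 51.4), sizes)

# The reported overall means are 70.6, 55.0 and 55.4; small differences
# arise because the inputs are themselves rounded to one decimal place.
print(first, second, third)
```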

Hypotheses tests. H1 predicted that, when
making forward causal inferences, auditors place
a significant weight on conditionality information.
Two tests of H1 were performed, using the
forward inference group data only. In the first
test, the dependent variable was the change in
causal judgments due to processing conditionality
information, and the independent variable
was the level of conditionality information. The
mean change was an increase of 9.3 when enhancing
conditions were present and a decrease
of 38.3 when offsetting conditions were present.
This difference was significant (t = 5.68, p =
0.001), which supported H1. In the second test,
the dependent variable was the causal relevance
points allocated to conditionality information.
The mean of 72.2 was significantly greater than
zero (t = 23.4, p = 0.001), which supported H1.
It also was significantly greater than the mean
points of 27.8 allocated to multiplicity information
(t = 7.21, p = 0.001).
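
The two relevance-point statistics above can be approximately reproduced from the summary figures in Table 3. The sketch assumes (the text does not say so explicitly) that the first is a one-sample t-test against zero and the second a paired comparison. Because each subject's points sum to 100, the paired difference equals twice the conditionality points minus 100, so its mean and standard deviation follow directly from the reported values.

```python
import math


def one_sample_t(mean, sd, n, mu0=0.0):
    """t statistic for a sample mean against a hypothesized value."""
    return (mean - mu0) / (sd / math.sqrt(n))


n = 27                          # forward inference group size
mean_cond, sd_cond = 72.2, 16.0  # points allocated to conditionality

# Test against zero (reported t = 23.4).
t_vs_zero = one_sample_t(mean_cond, sd_cond, n)

# Paired test against multiplicity points (reported t = 7.21).
# difference = cond - (100 - cond) = 2 * cond - 100
t_vs_multiplicity = one_sample_t(2 * mean_cond - 100, 2 * sd_cond, n)

print(t_vs_zero, t_vs_multiplicity)
```

Both values agree with the reported statistics to the precision given, which is consistent with, but does not prove, the assumed test forms.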

To supplement these tests, an analysis of variance
(ANOVA) was performed using the data
for both groups. The dependent variable was the
change in causal judgments, and the independent
variables were the direction of inference
and level of conditionality information. The results
were significant effects for conditionality
information and the interaction (Table 4).
Focusing on the interaction (Fig. 1), the effect of
conditionality information was significant for
the forward inference group (see above), but
not for the backward inference group (t = 0.02,
p = 0.98).


TABLE 4. Analysis of variance for change in causal judgments due to conditionality information: experiment 1

                     Sum of squares   df   Mean square   F       p
Main effects         8185              2   4092           9.05   0.001
  A                  851               1   851            1.88   0.176
  B                  7353              1   7352          16.26   0.001
Interaction (A×B)    7795              1   7795          17.24   0.001
Explained            15,980            3   5327          11.78   0.001
Residual             23,513           52   452
Total                39,493           55   718

A = backward vs forward causal inference.
B = conditionality information level.


[Figure: change in causal judgments (vertical axis) by direction of inference (backward, forward), plotted separately for enhancing and offsetting conditions.]

Fig. 1. Effects of conditionality information and direction of inference on causal judgments.
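
The F ratios in Table 4 follow from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, and each F is an effect mean square divided by the residual mean square. A short check using only the table's entries (the variable names are illustrative):

```python
# (sum of squares, degrees of freedom) for each row of Table 4
effects = {
    "A": (851, 1),
    "B": (7353, 1),
    "A x B": (7795, 1),
    "Explained": (15980, 3),
}
residual_ss, residual_df = 23513, 52

ms_residual = residual_ss / residual_df  # about 452, as tabulated

# F = (SS_effect / df_effect) / MS_residual
f_ratios = {
    name: (ss / df) / ms_residual for name, (ss, df) in effects.items()
}

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

The recomputed ratios match the tabulated F values (1.88, 16.26, 17.24 and 11.78) to two decimal places.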

H2 predicted that, when making backward
causal inferences, auditors place a significant
weight on multiplicity information. Two tests of
H2 were performed, using the backward inference
group data only. In the first test, the dependent
variable was the change in causal judgments
due to processing multiplicity information,
and the independent variable was the level
of multiplicity information. The mean change
was a decrease of 7.3 (24.3) when a relatively
weak (strong) alternative cause was present.
This difference was significant (t = 2.45, p <
0.02), which supported H2. In the second test,
the dependent variable was the causal relevance
points allocated to multiplicity information. The
mean of 54.6 was significantly greater than zero
(t = 12.4, p = 0.001), which supported H2. However,
it was not significantly greater than the
mean points of 45.4 allocated to conditionality
information (t = 1.05, p = 0.31).

In addition, an ANOVA using the data for both
groups showed (at least marginally) significant
effects for the direction of inference, level of
multiplicity information, and interaction (Table
5). Focusing on the interaction (Fig. 2), the effect
of multiplicity information was significant
for the backward inference group (see above),
but not for the forward inference group (t =
0.03, p = 0.98).


TABLE 5. Analysis of variance for change in causal judgments due to multiplicity information: experiment 1

                     Sum of squares   df   Mean square   F       p
Main effects         5966              2   2983          11.02   0.001
  A                  4865              1   4865          18.28   0.001
  B                  1094              1   1094           4.11   0.048
Interaction (A×B)    987               1   987            3.71   0.060
Explained            6953              3   2318           8.71   0.001
Residual             13,843           52   266
Total                20,796           55   378

A = backward vs forward causal inference.
B = multiplicity information level.


[Figure: change in causal judgments (vertical axis) by direction of inference (backward, forward), plotted separately for weak and strong alternative causes.]

Fig. 2. Effects of multiplicity information and direction of inference on causal judgment.


Discussion 
Experiment 1 tested hypotheses regarding
auditors’ information processing when making
forward and backward causal inferences. The
causal judgments of subjects making forward
inferences were significantly affected by conditionality
information, but not by multiplicity information.
These subjects also perceived conditionality
information to be of significantly greater
ex ante causal relevance than multiplicity information.
The results were analogous, albeit
weaker, for subjects making backward inferences.
Their causal judgments were significantly
affected by multiplicity information, but not by
conditionality information. However, these subjects
did not perceive, at least ex ante, a significant
difference in causal relevance between the
two information types. These somewhat weaker
results may have been due to the specific nature
of the conditionality information (i.e. information
about conditions that enhance or offset the
risk of an account balance error due to the target
cause). The subjects may have thought that such
information would be relevant even when making
a backward inference, because the same conditions
might enhance or offset other causes as
well as the target cause.

This experiment was designed such that the 
information type judged more relevant was al- 
ways processed first. Assuming equal access to 
either information type, this is the natural sequ- 
ence. However, this sequence may have intro- 
duced a bias toward finding a significant proces- 
sing effect for the information type judged more 
relevant. Experiment 2 was performed for repli- 
cation purposes, using a modified design which 
counterbalanced the sequence of processing 
and information type. That is, regardless of 
which information type was judged to be more 
relevant, one-half of the subjects processed con- 
ditionality information first, while the rest pro- 
cessed multiplicity information first. 


EXPERIMENT 2


Subjects 

The subjects were 58 senior auditors from a 
Big Eight public accounting firm, having a mean 
(s) of 39.0 (7.7) months of auditing experience 
and having worked on a mean (s) of 25.3 (10.2) 
audits. All subjects were in attendance at a pro- 
fessional training seminar. None of the subjects 
in experiment 1 participated in experiment 2. 


Design 

There were three between-subjects indepen- 
dent variables. A forward (backward) causal 
inference was made by 31 (27) subjects. As to 
multiplicity information, 29 (29) subjects were 
given information indicating that there was a re- 
latively strong (weak) alternative cause. As to 
conditionality information, 31 (27) subjects 
were given information indicating that condi- 
tions enhancing (offsetting) the target cause 
were present. 


Procedure, setting and task 

The procedure, setting, and task were the 
same as those of experiment 1, with the follow- 
ing exception. After allocating 100 points be- 
tween the two information types according to 
causal relevance, one-half of the subjects were 
instructed to open the envelope marked A, read 
its contents, and respond to the causal strength 
and confidence rating questions. Then, these 
subjects were instructed to open the envelope 
marked B, read its contents, and respond to the 
questions again. For the other one-half of the 
subjects, the order of envelopes was reversed. 
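
The counterbalancing described above can be sketched as follows. The text states only that one-half of the subjects received each envelope order; the alternating assignment and the subject identifiers below are assumptions for illustration, and the actual assignment method used in the experiment is not reported.

```python
from collections import Counter
from itertools import cycle


def counterbalance(subject_ids, orders=(("A", "B"), ("B", "A"))):
    """Assign envelope-opening orders so each order is used equally often.

    Alternation is an illustrative choice; random assignment within
    blocks would serve the same purpose.
    """
    return {sid: order for sid, order in zip(subject_ids, cycle(orders))}


# Hypothetical identifiers for the 58 subjects of experiment 2.
assignment = counterbalance(range(1, 59))
order_counts = Counter(assignment.values())
print(order_counts)
```

With 58 subjects, each order is assigned to 29 subjects, so the sequence of processing no longer depends on which information type a subject judged more relevant.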


Results 
Preliminary analyses. Table 6 presents descriptive
statistics on the causal judgments, confidence
ratings, and points allocated to information
types. Overall, the initial causal judgments
were moderately high (68.4), and there was a
significant decrease from the first to second
judgments (t = 3.18, p = 0.002) as well as from
the second to third judgments (t = 2.17, p =
0.034). The confidence ratings were moderately
high and stable over time. Overall, conditionality
information was judged to be more relevant
than multiplicity information (t = 2.40, p =
0.02).


TABLE 6. Descriptive statistics: experiment 2*

                        Forward           Backward
                        inference group   inference group
                        (n = 31)          (n = 27)          Overall
Causal judgments:
  First                 65.2 (21.3)       72.2 (18.2)       68.4 (20.1)
  Second                59.4 (27.1)       56.7 (22.9)       58.1 (25.1)
  Third                 53.9 (31.4)       48.5 (27.3)       51.4 (29.4)
Confidence ratings:
  First                  6.7 (2.1)         7.0 (1.8)         6.8 (1.9)
  Second                 7.2 (2.1)         6.5 (2.0)         6.9 (2.1)
  Third                  7.5 (2.0)         6.8 (2.0)         7.2 (2.0)
Points allocated to information types:
  Conditionality        68.5 (19.8)       45.2 (23.4)       57.7 (24.4)
  Multiplicity          31.5 (19.8)       54.8 (23.4)       42.3 (24.4)

*Means with S.D. in parentheses.


Hypotheses tests. The test procedures were
the same as those of experiment 1. Regarding H1,
the mean change in causal judgments due to processing
conditionality information was an increase
of 3.7 when enhancing conditions were
present and a decrease of 35.3 when offsetting
conditions were present (t = 7.49, p = 0.001).
The mean causal relevance points of 68.5 were
significantly greater than zero (t = 19.3, p =
0.001) and significantly greater than the mean
points of 31.5 allocated to multiplicity information
(t = 5.23, p = 0.001). These results supported
H1.

An ANOVA using the data for both groups 
showed significant effects for the direction of in- 
ference, level of conditionality information, and 
interaction (Table 7 and Fig. 3). The effect of
conditionality information was significant for 
the forward inference group (see above), but 
not for the backward inference group (t = 0.15, 
p = 0.88). 

TABLE 7. Analysis of variance for change in causal judgments due to conditionality information: experiment 2

                     Sum of squares   df   Mean square   F       p
Main effects         8638              2   4319          10.38   0.001
  A                  2182              1   2182           5.24   0.026
  B                  6746              1   6746          16.20   0.001
Interaction (A×B)    5032              1   5032          12.09   0.001
Explained            13,670            3   4557          10.95   0.001
Residual             22,482           54   416
Total                36,152           57   634

A = backward vs forward causal inference.
B = conditionality information level.


[Figure: change in causal judgments (vertical axis) by direction of inference, plotted separately for enhancing and offsetting conditions.]

Fig. 3. Effects of conditionality information and direction of inference on change in causal judgment.


Regarding H2, the mean change in causal judgments
due to processing multiplicity information
was a decrease of 10.0 (26.8) when a relatively
weak (strong) alternative cause was present
(t = 2.02, p = 0.06). This marginally supported
H2. The mean causal relevance points of
54.8 allocated to multiplicity information were
significantly greater than zero (t = 12.2, p =
0.001), which supported H2. However, it was
not significantly greater than the mean points of
45.2 allocated to conditionality information (t =
1.07, p = 0.30).

An ANOVA using the data for both groups
showed significant effects for the direction of inference
(Table 8 and Fig. 4). The effect of multiplicity
information was marginally significant for
the backward inference group, but not significant
for the forward inference group (t = 0.73,
p = 0.47).


Discussion 

The results of experiment 2 in general were 
quite similar to those of experiment 1. Besides 
providing additional support for the conclusions 
stated earlier, these tests indicated that the con- 
founding of the information types’ judged causal 
relevance and sequence of processing of experi- 
ment 1 was not consequential for hypotheses 
testing purposes. 


TABLE 8. Analysis of variance for change in causal judgments due to multiplicity information: experiment 2

                     Sum of squares   df   Mean square   F       p
Main effects         8667              2   4334          11.84   0.001
  A                  8077              1   8077          22.07   0.001
  B                  448               1   448            1.22   0.273
Interaction (A×B)    1588              1   1588           4.34   0.042
Explained            10,255            3   3418           9.34   0.001
Residual             19,768           54   366
Total                30,023           57   527

A = backward vs forward causal inference.
B = multiplicity information level.


[Figure: change in causal judgments (vertical axis) by direction of inference, plotted separately for weak and strong alternative causes.]

Fig. 4. Effects of multiplicity information and direction of inference on change in causal judgment.


CONCLUDING REMARKS 


Despite the large literature on audit judgment
and decision-making, the role of judged causal
relations has been relatively neglected. On the
assumption that auditors’ causal schemata affect
their explanations and predictions of audit
events, such neglect is regrettable. This paper
has provided a review of the philosophical and
psychological literatures on causality as well as
preliminary evidence on the association between
the direction of causal inference and auditors’
information processing. The empirical results
were generally supportive of the conclusion
that auditors place more weight on conditionality
information when making forward
causal inferences but more weight on multiplicity
information when making backward causal
inferences. Prior to generalizing this conclusion,
however, it is important for future research to
conduct replications and extensions. Other important
issues for future research include how
contextual variables affect auditors’ causal judgments
and how auditors’ causal reasoning affects
their subjective probability judgments. In pursuing
the latter issue, it is recommended that researchers
go beyond the view that causal reasoning
may lead to biases. Causal knowledge and
reasoning are powerful cognitive tools for auditors
in explanation and prediction tasks. The
function and value of these tools should receive
at least as much attention as their costs do.


THE EMERGENCE, ROLES AND CONSEQUENCES OF AN
ACCOUNTING–INDUSTRIAL RELATIONS INTERACTION*


PHILIP D. BOUGEN
University of Leeds and Rice University, Texas


Abstract

This paper reports the findings of one particular company’s experiences with a managerial strategy to
interweave the concepts and techniques of accounting with the structures and processes of industrial
relations. In particular, the paper argues that the emergence, roles and consequences of accounting systems
can be best understood in the context of the local social situations in which they operate.


In 1921 the Hans Renold company of Manches- 
ter, U.K., injected accounting numbers and prin- 
ciples into the heart of its industrial relations 
programme. This interweaving of accounting 
and management—labour relations was not the 
product of chance. It was a key component 
of a strategic managerial initiative to improve 
both industrial relations and other wider aspects 
of corporate performance, with accounting 
operating in a number of distinct though over- 
lapping organisational arenas. This study 
explores the origins, objectives and conse- 
quences of this merging of accounting and 
industrial relations. In doing so, it addresses a 
number of issues which are of specific impor- 
tance for the accounting—industrial relations 
interface and which also cast light on more 
fundamental accounting concerns.

How and why particular forms of accounting 
emerge both at an organisational (Banbury & 
Nahapiet, 1979; Boland, 1979; Cooper, 1981; 
Hopwood, 1987) and at a societal level (Merino 
& Neimark, 1982; Burchell et al., 1985; Hoskin
& Macve, 1986; Miller, 1986) cannot be considered
in a contextual vacuum. The recognition
that the emergence of accounting systems is
located within complex networks of economic,
social and political considerations constitutes a 
backdrop of great significance for those in- 
terested in a better understanding of accounting. 
The analysis of how and why accounting systems 
can emerge at certain conjunctures to penetrate 
the very fabric of social and organisational life 
can assist in the development of a greater aware- 
ness of the nature and diversity of the function- 
ing of accounting. 

Recognition of the significance of the contex- 
tual locations of accounting led to a series of per- 
suasive pleas (Colville, 1981; Hopwood, 1983; 
Tomkins & Groves, 1983; Otley, 1984; Scapens, 
1984; Roberts & Scapens, 1985) for the explora- 
tion of “accounting in action” (Hopwood, 
1978). However, the argument that the actual 
functioning of accounting could be best 
understood with reference to the specific arenas 
in which it operated constituted more than
advocacy of uncritical contextual analysis.
Often central to the reasoning was the challenge 
to researchers to unshackle their work from pre- 
conceptions as to the underlying rationale of the 
relationships between accounting and the social.
The very purposes of accounting were to be
subjected to scrutiny (Burchell et al., 1980),
with accounting being interrogated as regards
its potential to fulfil its often alleged role as a
neutral information source (Tinker, 1980;
Tinker et al., 1982; Cooper & Sherer, 1984;
O'Leary, 1985). As these lines of enquiry were
pursued (Rosenberg et al., 1982; Boland &
Pondy, 1983; Berry et al., 1985; Roberts &
Scapens, 1985; Preston, 1986; Hopwood, 1987)
accounting came to be viewed increasingly as a
highly differentiated craft serving a variety of
roles in a variety of contexts. No longer was it
tenable to assume accounting to have “... some
essential role or function” (Burchell et al., 1985,
p. 409).

*I am grateful to Anthony Hopwood, Peter Miller and an anonymous referee for their comments on an earlier draft of this


As the richness and complexities of accounting
contexts are more fully investigated, issues
which have so far had only a marginal impact
upon the content and direction of research 
endeavours will become more central to future 
enquiries. The emergence of specific forms of 
accounting in specific social contexts, their very 
appearance, functioning, and perhaps ultimate
disappearance, warrant increased examination.
Such a dynamic view of the fluidity of accounting
challenges the myth of a generality of both
accounting purpose and social receptivity, requiring
greater attention to be paid in the future
to the specification and definition of which
particular aspects of accounting systems come
to be intertwined with particular social arenas.
The origins and stimuli which encourage the
emergence of accounting in the social will be- 
come areas of discussion and debate (Burchell et 
al., 1985). The facilitative and enabling proper- 
ties of accounting (Hopwood, 1985) will com- 
pel researchers to consider how and why it 
might be appropriated, mobilised and strategi- 
cally coupled to particular priorities for partisan 
and opportunistic purposes. As the linkages and 
dependencies which are forged between ac- 
counting and the social are explored, insights 
are possible into some of the expectations and 
perceptions of accounting which guide its inter- 
vention into the social. Such studies offer fas- 
cinating possibilities for an investigation of the 
attributes and qualities conferred on accounting.

However, if accounting is to be strategically
coupled with a variety of objectives, structures 
and processes it is important that its actual performance
in such arenas be monitored and appraised.
Whilst people’s perceptions of the
facilitative capacity of accounting might offer insight
into the potential scope and parameters of
its operation, of equal significance is the actual
success or failure of accounting to fulfil such
tasks. The extent to which accounting can offer
a sufficiently robust conceptual and calculative
framework for the achievement of various goals
would be of interest, as would be evidence of
situations where the expectations of accounting
were to prove inappropriate, necessitating a
reappraisal and reformulation of the accounting
mission. Clearly to capture such processes,
research of both a sufficient depth and over a
suitably extended time scale is essential. Such a
detailed type of analysis would have additional 
attractions. Since the functioning of accounting 
within specific contexts is likely to prove 
neither neutral nor inconsequential, what sub- 
sequently results from the interweaving of ac- 
counting and the social is important. As account- 
ing systems penetrate various social arenas an 
exploration of their tangible effects upon those 
who come into contact with them would be pos- 
sible. Equally significant would be the attitudes 
of the participants towards accounting: whether, 
for example, as a response to the system’s opera- 
tion these were to change. Although rarely in- 
vestigated, the extent of actual collaboration 
with or resistance to accounting processes is of 
immense importance for any serious evaluation 
of the performance of accounting systems
within the social.


ACCOUNTING AND INDUSTRIAL RELATIONS 


The specific focus of attention of this study is 
the interaction between accounting and indust- 
rial relations, an illustration of the merging of 
accounting and the social of substantial interest. 
If the emergence, roles and consequences of ac- 
counting interventions into the social are to be 
subjected to the type of detailed scrutiny and assessment
suggested above, then this arena will
prove to be particularly fertile for analysis. 
Industrial relations as a general categorisation of 
organisational considerations is highly diverse in 
its scope. Taking the relationship between man- 
agement and labour as a central theme, it can 
embrace all organisational (and indeed environ- 
mental) actions, structures and processes which 
impinge upon this interface. It can, therefore, incorporate,
on the one hand, areas perhaps susceptible
to calculation and numeration such as
wage payment levels, work methods and
employee output and, on the other hand, issues
more ambiguous in formulation, such as the extent
of management–labour trust, the basis of
managerial prerogatives and employee compliance
with managerial directives.

Whilst one might acknowledge the possibility 
of the chance emergence of accounting in the industrial
relations field, the strategic managerial
injection of accounting, with its emphasis on the
calculative definition, recording and measure- 
ment of organisational activities, into an arena 
often characterised by unclear and potentially 
unstable relationships is not unproblematic. The 
priorities and expectations which stimulate a 
particular managerial policy to forge specific 
couplings between certain accounting and industrial
relations considerations demand both a
conceptual orientation and a research style sen- 
sitive to the peculiarities of such situations. A 
refusal to accept either the efficacy of such cou- 
plings as being preordained or the role for the as- 
sociated accounting systems being similarly pre- 
dictable would appear essential, as indeed 
would be a willingness by researchers to explore 
more actively the specific local contexts of such 
unions.

There has been, unfortunately, a lamentable 
paucity of such research in the accounting—in- 
dustrial relations field. Too often, issues deemed 
central to the understanding of the area have
been the product of speculation rather than the
analysis of specific empirical illustrations. As a
consequence, a style of research has emerged
more concerned with the potentialities and assumed
purposes of accounting than its actualities.
The portrayal of accounting as merely a
calculative and neutral information source
(Climo, 1976; Palmer, 1977; Lau & Nelson,
1981; Pope & Peel, 1981), unproblematically
residing in the industrial relations arena,
characterised many such endeavours. Even 
studies which displayed some sensitivity either 
to the possibility of accounting playing a variety 
of roles (Craft, 1981; Maunders & Foley, 1984)
or to the tenuous basis of the appropriateness of 
an accounting—industrial relations merger 
(Batstone, 1979; Ogden & Bougen, 1985; Owen 
& Lloyd, 1985) still detached their analysis from 
the richness and complexity of an actual contextual
location (an exception being Amernic,
1985).

This neglect of specific contexts has had im- 
plications beyond a tendency towards unrealis- 
tic speculation. Closely related to it has been the 
willingness to conceptualise the emergence of
an accounting–industrial relations union in
terms of general and often environmentally determined
trends (Foley & Maunders, 1977;
Purdy, 1981; Hussey & Marsh, 1983; Jackson- 
Cox et al., 1984) without ever specifying the 
processes either by which such external pres- 
sures are identified or become filtered down 
into particular organisational contexts. The 
search, therefore, for general explanatory vari- 
ables has been undertaken again at the expense 
of the analysis of local and idiosyncratic com- 
pany responses offering the opportunity of un- 
derstanding how even the general trends can 
stimulate a diversity of accounting and industrial 
relations mergers. Moreover, research which 
has focused on specific organisational instances
of the actual merging of accounting and industrial
relations (Mitchell et al., 1980; Moore &
Levie, 1981; Reeves & McGovern, 1981) has 
often adopted too short a time horizon to offer 
any overall picture of either the expectations 
and pressures which preceded such action or 
the process of any reformulation and subsequent 
change of the bases of its merging. This 
shortcoming has also had ramifications for the 
seeking out of the real consequences and effects 
on participants of the merging of accounting and 
industrial relations. Whilst studies exist which 
have sought to explore the significance of this
issue for a particular industry (Bryer et al.,
1982) and for more aggregated levels of national
economic activity (Harte and Owen, 1987), 
remarkably little evidence is available at the 
organisational level. Such an omission would appear
particularly serious since the industrial relations
area itself, incorporating such considerations
as productivity, remuneration and workplace
discipline, has its most immediate and
tangible impact on the individual organisational
participant. 

To conclude, it would seem that there is a very 
real need to trace and to evaluate the emergence,
roles and consequences of the interweaving
of accounting and industrial relations in one
particular organisation over a substantial period
of time. This study attempts to take advantage of 
the opportunity to do so. 


THE HANS RENOLD COMPANY 


The research site 

The site used in this study for the exploration 
and appraisal of the merging of accounting and 
industrial relations is the Hans Renold Company 
of Manchester, U.K. A sufficient quantity and var- 
iety of research material exists to reassemble 
some of the expectations and pressures which 
preceded the initiative and to trace its sub- 
sequent performance over a substantial period 
of time. Three sources of research material have 
been employed in the analysis. Firstly, archival 
material in the form of the recorded minutes of 
a series of management—employee meetings was 
used. These meetings were explicitly created by 
management as forums for the joint discussion of 
accounting—industrial relations considerations. 
They constitute an active dialogue, helping to 
expose some of the assumptions and percep- 
tions of the major participants and to make visi- 
ble the process by which they developed and 
evolved. Secondly, a central participant in the 
events described in this study was C. G. Renold, 
the son of the Company’s founder, and an emi- 
nent and prolific writer on the management of 
enterprises. His writings provide an additional 
insight into some of the beliefs and expectations
which stimulated the intertwining of accounting
and industrial relations (Renold, 1917, 1921,
1927, 1928, 1929a, b, 1950). Thirdly, the Company
was considered in many respects to be a
centre for the development and implementation 
of innovative work methods and enlightened 
managerial practices. It has attracted, therefore, 
a considerable interest in the history of its ad- 
ministrative experiences. This literature offers 
some further evidence of what was to transpire 
(Urwick & Brech, 1946; Tripp, 1956). 


The Company 1879-1914: accounting, 
scientific management and employee welfare

The focus of attention for the study is a profit- 
sharing scheme which was formally introduced 
into the company in 1921 and which operated 
for a 10-year period. The scheme was conceived 
and designed by management as a vehicle for the 
interweaving of accounting and industrial relations.
However, an appreciation of this particular
merging of accounting and industrial relations
demands a consideration of an historical
configuration of variables and influences which
preceded it and the ways in which they became
enmeshed with a more immediate set of pressures
expediting managerial action. It will be
argued that the managerial decision to introduce
what will be shown to be a rather unusual type of 
scheme and to do so at the particular conjuncture,
had roots apparent at the start of the
Company’s existence. This is not to imply that
one can discern a continuity of intent with the 
profit-sharing scheme being merely the catalyst 
of an inevitable series of demands and events. 
However, an organisational infrastructure and 
culture had emerged which influenced the deci- 
sion to introduce the scheme, helped to shape its 
content, and facilitated its acceptance and de- 
velopment. 

The Company was founded in 1879 by Hans 
Renold, a Swiss engineer. In 1880 he invented 
the bushroller chain for the rapidly expanding 
bicycle market, an achievement of major engineering
significance (Tripp, 1956, p. 31). Up
to the outbreak of World War 1 the commercial 
and financial achievements of the Company 
were substantial. After its formation as a private
limited Company in 1903 the financial results
had continually improved. By 1907, sales were 
£100,000 per annum, having doubled from their 
1903 level, share capital had increased from 
£100,000 to £200,000 (and was subsequently 
to rise to £260,000 by 1913) and the Company 
was sufficiently confident to have invested 
£30,000 in a new factory and £20,000 on plant 
and machinery (Tripp, 1956).

The structural and behavioural relationships 
within the factory which underpinned this
success were very much influenced by Hans
Renold’s own personality and personal predispositions.
More significantly, they represented a
crude and idiosyncratic, though entirely prag- 
matic, blending of accounting and industrial re- 
lations which was to prove of immense signifi- 
cance for their later more systematic coupling. 

Hans Renold sought to create an organisational
culture and operational infrastructure
congruent with his own beliefs about the aims 
and preferred methods of business administra- 
tion. This consisted of a combination of benevo- 
lent paternalism and the search for organisa- 
tional efficiency. Hans Renold was first and 
foremost a trained and highly competent
mechanical engineer to whom quality workmanship
was always critical (Renold, 1950; Tripp,
1956). This he coupled with a strong religious
and humanitarian interest in the welfare of his
employees where “nothing that contributed to
good work was too good for his workpeople” 
(Renold, 1950, p. 13). In 1896 the Company 
introduced a 48 hour working week when the 
industry norm was at least 52 hours and paid 
wages which exceeded those of the district engineering
rate (Renold, 1950; Tripp, 1956). A
well lit and ventilated factory was built in 1906
and health, educational and recreational
facilities were subsequently provided. Although
Hans Renold argued “Our job is not to make
chains. It is to make men and women; they will 
make chains for us” (Tripp, 1956, p. 25), he also 
took great care to establish Company structures 
and processes which would facilitate their effi- 
cient production. 


Firstly, the Company exhibited an unusually 
early commitment to the principle of scientific
management and the search for a well disciplined
flow of work. Hans Renold had met F. W.
Taylor and had been impressed with his first-hand
observations of Taylorism in practice
(Urwick & Brech, 1946, p. 167), later becoming 
a founding international member of the F. W. 
Taylor Co-Operators (Drury, 1918, p. 132). 
Whilst he viewed scientific management as
“neither more nor less than common sense”
(Tripp, 1956, p. 30), Hans Renold, along with his
son C. G. Renold, created “a generation ahead of
its time, ... a truly outstanding illustration of
British scientific management in practice”
(Urwick & Brech, 1946, p. 169). Standardised information
flows, organisation charts and merit
rating had all been established prior to World
War 1 (Urwick & Brech, 1946; Renold, 1950;
Tripp, 1956).

Secondly, a comprehensive cost accounting 
system was employed in the factory. At his 
farewell speech as Chairman in 1928, Hans 
Renold described how he commenced the business
with “one machine man and his boy, and a
bookkeeper” (Tripp, 1956, p. 125). It seems significant
that at the very start of the Company’s
life, when faced by a whole multitude of production,
marketing and finance problems, the keeping
of accounting records was considered to be
of prime importance. In 1900 a “scheme of costing”
was introduced which “would be considered
modern even today” (Renold, 1950, p. 13)
and by 1914 a cost manager had been employed
(Tripp, 1956, p. 162). This early commitment to
cost accounting was untypical of much of British 
manufacturing industry, which introduced costing
systems as a direct consequence of Government
intervention during World War 1
(Armstrong, 1985; Loft, 1986).

This coupling of paternal benevolence, scientific
management and accounting is of significant
interest. As suggested above, it represented an
idiosyncratic and improvised blending of a number
of broad areas of influence on the early years
of the Company’s operations. The strategic objective
underlying this network of practices
seemed to be the search for efficiency coupled
to an interest in employee welfare. Whilst it has
been argued (Child, 1969; Fox, 1985) that this
type of managerial paternalism was often accompanied
by an attention to the role of labour in the
production process, its coupling with scientific
management and hence increased control over 
labour had been criticised by employers (Cad- 
bury, 1914) sharing similar humanitarian ideals 
to Hans Renold. Similarly, whilst the quest for 
engineering excellence which permeated the 
Company provided an underlying commitment 
to an ideology of measurement and quantifica- 
tion, the dual operation of scientific manage- 
ment and cost accounting systems was not 
necessarily mutually reinforcing (Epstein, 1978; 
Wells, 1978). The ambiguity and imprecision of 
accounting could be viewed as being inconsis- 
tent with the scientific management aim for 
exact measurement. 

However, for the Hans Renold Company this 
combination of interests and priorities seemed 
to coexist with a reasonable degree of compatibility.
More significantly, they constituted an operational
infrastructure and corporate culture
which was to provide a receptive and facilitative
organisational framework for the subsequent
and more premeditated injection of accounting
into the industrial relations arena. It will be
argued that as a number of external events and
influences became entangled with an equally
complex set of internal objectives and requirements,
C. G. Renold recognised the additional
potentialities available from the intertwining of
accounting, industrial relations and indeed scientific
management.


The impact of the 1914-18 War upon 
industrial relations in the company 

The period 1910–20 was one of serious social
and political unrest in Britain, being described
as “a climax of class-conscious self
activity among the workers which in Britain has
not yet been surpassed” (Hinton, 1973, p. 13).
These struggles were most manifest in the conduct
of industrial relations (Hinton, 1973;
White, 1975; Clegg, 1985) and intensified during
the War, leading to a marked increase in
socialist organisations and an upsurge in trade
union growth and militancy (Hammond, 1919;
Cole, 1923). Central to this was the Ministry of
Munitions Act 1915 by which the government
sought to exert increased regulation over the 
general labour market and workshop practices 
and discipline of factories employed on war 
work. The workforces in these “controlled es- 
tablishments” were subjected to government 
control in such areas as enforced mobility, the 
suspension of restrictive practices, imposed 
changes in work method and the dilution of skil- 
led by non-skilled labour. Despite government 
legislation to make strikes illegal, strikes did 
occur (Hinton, 1973, p. 37), with the Manches- 
ter engineering industry being at the centre of 
the most bitter disputes. The Hans Renold Com- 
pany became a controlled establishment (Tripp, 
1956, p.104) and, whilst at the heart of this geographical
location, lost no production.

The Company’s success in maintaining wartime
production seemed to owe a great deal to a
combination of scientific management practices 
and a piece-rate wage payment scheme whereby 
“no limit was set to earnings” (Ministry of Munitions,
1922, Vol. 4, Part 1, pp. 33–34). The profit
potential available from munitions contracts was 
sufficient for the Company to have to repay to 
the government in 1917 monies liable under the 
Munitions Levy and Excess Profits Duty (Profit- 
Sharing Committee Minutes, 6.9.20). In the 
same year, employees in the Company raised the 
possibility of some form of profit-sharing being 
introduced but had it rejected on the basis that 
“the Works economy was distorted by muni- 
tions contracts” (Renold, 1950, p.26). The Com- 
pany’s judicious use of this profit potential via 
the piece-rate scheme, whilst allowing it to 
avoid the excesses of industrial relations strife, 
was not, however, sufficient to prevent some
problems from appearing.

The major impact of the War upon the Company
was its disruption of the communal spirit
and behavioural continuity which had been nurtured
by the Renolds, who had consistently
sought a personal proximity with their work- 
force. During the War the number of employees 
at the Company rose from 1200 to 2300 with the 
result that: 


“the place was full of strangers, many of them newcomers
to the industry. The old contact between management
and employees was disappearing and mutual confidence
was slipping” (Renold, 1950, pp.16-17).


By the end of 1916 the situation had become so 
severe that:


“if a general breakdown of works morale was to be
avoided some means of talking things over between Management
and Workers was essential” (Renold, 1950,
p.18).


During the early months of 1917 when the 
Manchester engineering industry was at the 
centre of a seething undercurrent of labour 
discontent (culminating in strikes involving
200,000 engineering workers, the largest of the
War; Hinton, 1973, p. 196) the Hans Renold
Company initiated a formal management–labour
dialogue. Using directives from the Ministry of 
Munitions to establish Joint Accident Prevention 
Committees as “a pretext and an occasion” (Re- 
nold, 1950, p. 19) the Company set up a Welfare 
Committee. Structured along non-trade union 
lines, with employee representatives to be nomi- 
nated by management, the Welfare Committee 
was designed: 


“to consider questions affecting conditions under which 
the work of the place is carried on” (Renold, 1950, p.19).


Its role as envisaged by management was to act 
as a joint consultative arena for the manage- 
ment—labour discussions and the improvement 
of general working conditions in the factory. The 
initiative created problems however, since: 


“No sooner had the provisional committee got to work
than Management received a notice from hitherto unknown
individuals to the effect that:-
A Committee of Stewards had been established in the
shops at Renolds Works by members of the Amalgamated
Society of Engineers for looking after their interests as
Trade Unionists” (Renold, 1950, p.19).


Whereas the Welfare Committee was intended 
by management as an arena structured on the 
“basis of common interest in the prosecution of 
a common enterprise”, the Shop Stewards Committee
was seen by the trade unions as a vehicle
for collective bargaining and negotiation with
“divergent interests to be reconciled” (Renold, 
1950, p.111). The eventual demise of the 
Welfare Committee in 1920 was due to a combination
of constitutional irrelevance and labour
indifference. The Shop Stewards Committee
“made it more glaringly patent that (its) functions
were not vital” (Renold, 1950, p.109)
whilst “its explanations did not get across, or
have any real effect on the feeling or understand- 
ing of the workers as a whole” (Renold, 1950, 
p.23). 

In 1919 the Company made a net loss of
£3000 on the year’s trading (Tripp, 1956, 
p.119), the first such occurrence in its history, as 
it sought to respond to new products, work 
methods, greater competition, and the redirec- 
tion of factory output away from munitions 
contracts. Moreover, the Company’s traditional 
industrial relations policy of seeking a close and 
paternal relationship with its employees had already
been eroded by the effects of the War. Indeed
the attempt via the Welfare Committee to
establish a more formal arena for the develop- 
ment of a management—labour dialogue was on 
the verge of collapse (Renold, 1950). It was 
under these circumstances that C. G. Renold 
began to consider the concept of profit-sharing 
as a sympathetic point of convergence for a 
number of corporate priorities and interests. 
Moreover, C. G. Renold believed that the
Company’s experiments and continued interest 
in accounting and scientific management might 
also be harnessed for such a project. The 
strategic injection of accounting and scientific 
management into these arenas was to be based 
upon their potentiality to offer management 
contributions which exceeded the conventional 
parameters of the routine recording and control 
of operational activities. The relaunch of man- 
agement—labour consultation and the improve- 
ment of industrial relations were now to come 
within their sphere of influence. 


PROFIT-SHARING AND THE RENOLD 
PHILOSOPHY OF INDUSTRIAL RELATIONS 


During the 1914-18 War and in the immediate
post-War period discussions of profit-sharing
were by no means rare amongst those
concerned with the resolution of industrial
problems (Macara, 1921; Bowie, 1922; Gilchrist,
1924; Sheldon, 1924). Indeed the concept had a
distinguished pedigree (Taylor, 1884; Gunton,
1888; Clayton, 1912; Watney & Little, 1912) 
with attempts having been made to trace its ori- 
gins back to feudalism and product sharing (Gil- 
man, 1889). What is of particular interest in this 
context, however, is how C. G. Renold constructed
an incredibly elaborate, though in operational
terms intensely pragmatic, role for profit-sharing.


C. G. Renold seemed to believe that the accep- 
tance by labour of the managerial prerogative to 
determine workshop relationships could not be 
taken for granted, since “the work of very many 
men, probably of most is given more or less un- 
willingly” (Renold, 1921, p.210). In particular, it 
was in the workshop where the employee 
“meets and resents the arbitrary exercise of 
authority” (Renold, 1921, p.209) that such prob- 
lems were most manifest. It was there that “the 
workers are irritated beyond measure by the in- 
efficiency and blundering in organisation and 
management which they detect on every side”
(Renold, 1917, pp. 161—162). Although with the 
introduction of scientific management “effi- 
ciency, in a material sense, has been achieved” 
this had been accomplished “at the cost of plea- 
sure and interest in work: and one problem 
which faces us now is the possibility of restoring 
these to some extent as, for instance, by some 
devolution of management responsibility onto 
the workers” (Renold, 1929a, p.10). 


C. G. Renold’s concern with employee unease 
and the basis of managerial prerogatives had, 
therefore, led him to consider some “devolution
of management”. This he envisaged as the introduction
into workplace relations of various joint
consultative bodies whereby the physical meeting
of management and labour would offer management
the opportunity to explain and perhaps
construct the new basis for authority which concerned
him, thereby facilitating a cognitive congruency
between management and labour.


“It is, indeed, a positive duty on the part of every
employer... to seize opportunity to discuss management 
and business problems with his workpeople... greater 
knowledge, on the part of the workers, of the actual prob- 
lems of industry is the best hope of the future” (Renold, 
1921, p.211).


This “greater knowledge” would consist of management


“tell(ing) them what it is all about, what the problem is,
and how you are seeking to tackle it” (Renold, 1927, 
p.24). 


There is, therefore, underlying the attempt to 
reestablish more co-operative relationships be- 
tween management and labour, a central role for 
the wider flow of information about the prob- 
lems and performance of the company. More 
critically, C. G. Renold recognised the possibility
of coupling profit-sharing (and its financial in- 
ducement to labour to participate in discus- 
sions) with the education of labour: 


“A generous scheme of profit-sharing would be of great 
help in this connection. One of the difficulties of giving 
the kind of information necessary under this head is that 
the worker so often does not really believe that efficiency 
in industry is his affair... A profit-sharing scheme, there- 
fore, which made successful administration a direct and 
living issue to workers no less than to management and 
shareholders, would make the giving of full information
about the progress of a business much easier. Under such
a scheme full accounts of the manufacturing activities 
of the business could quite well be laid before a com- 
mittee of workers... Under such circumstances, business 
and management problems could be discussed with the 
very greatest freedom and with a corresponding benefit 
from an educational point of view” (C. G. Renold, 1921, 
p.233, emphasis added). 


Not only would profit-sharing, therefore, make 
employee education a more relevant exercise 
for labour but it would also facilitate the disclo- 
sure of accounting information. 

Such a line of analysis forges an almost au- 
tomatic relationship between employee educa- 
tion, accounting information and profit-sharing, 
assuming as it were not only the inevitability of 
such a linkage but also some implicit consensus
as to the desirability of its objectives and effects. 
The notion of employee education, whilst being 
presented as a neutral and mutually beneficial
process, carried with it an assumption that not 
only would management be the educators but 
that the curriculum would be based on their 
priorities and perspectives. As Gilchrist (1924, 
p.228) argued: 


“they will appreciate the meaning of lean as well as fat 
years; they will, too, know more clearly the value of their 
own efforts and their place in the scheme of things.” 


Furthermore, the curriculum being based on sci- 
entific management and cost accounting was 
similarly presented in unproblematic terms: 


“Scientific management finds perhaps its culminating ex- 
pression in the work of the Financial Controller, or in the 
Cost Accounting Section of the Finance Department... 
They aim at laying down beforehand what is expected, 
based on a study of all the relative facts...” (Renold,
1929a, pp. 764–769, emphasis added).


Moreover, the resort to management by “facts” 
was not something divorced from industrial
democracy: 


“... it may be pointed out that there is nothing incompati- 
ble between scientific management and the develop- 
ment of schemes of ‘Industrial Democracy’... Indeed the 
more management can be based on ascertained facts and 
follow predetermined procedures the more possible it is 
to take the workers into consultation” (Renold, 1929a, 
p.767). 


C. G. Renold seemed to be constructing a 
rudimentary framework for the linking of man- 
agement theory and practice. This was founded 
upon a belief that the creation of a new basis for 
managerial prerogatives could be integrated 
with his existing interest in joint consultation 
and profit-sharing and the potential these gave 
for improving management–labour relations.
The disciplines of scientific management and 
cost accounting were to provide the intellectual 
and operational foundation for this endeavour. 
The central role in this exercise for “manage- 
ment by facts” might superficially appear little 
more than an extension of the Renold interest in 
scientific management and accounting. How- 
ever, whereas these techniques had previously 
been employed as a means of planning and 
monitoring factory performance, it now appears
that this resort to the acquisition of data, to quan- 
tification and to measurement was being called 
upon for different purposes. Managerial posses- 
sion of this body of knowledge elevated them to 
the role of experts with access to a source of abil- 
ity detached on the basis of its alleged scientific 
rigour from more coercive bases of managerial 
prerogatives. Moreover, the corollary to this 
argument seemed to be that it was the responsi- 
bility of management to ensure that labour was 
made aware of the nature and importance of the 
problems the company faced and of the critical 
nature of environmental and market forces for 
its survival. 

Even within the debates of the time, the 
rationale of C. G. Renold’s analysis was not 
unanimously shared. First, a relationship be- 
tween profit-sharing and scientific management 
was not obvious since as Jenks (1960, p. 436) 
notes “most of the profit-sharing schemes were 
rejected by engineers as smacking of pater- 
nalism, as remote from tangible objectives, and 
as irrelevant.” C. G. Renold although a trained 
engineer and a disciple of scientific management 
clearly saw such a relationship as being highly 
functional. Moreover, since paternalism and 
scientific management had previously coexisted 
within the Company, this was not a relationship 
which was totally alien either to management or 
to labour. Secondly, advocates of profit-sharing 
whilst promoting its use in industrial relations 
had explicitly sought to decouple it from the dis- 
closure of accounting information. Gilman’s 
(1889, p. 248) argument that, “the great major- 
ity of firms dividing profits have emphatically 
reserved to themselves full control of their 
business and their accounts”, echoed the con- 
cern of a number of commentators (Taylor, 
1884; Graham, 1921). Finally, it is interesting to 
note for subsequent events that the strategic in- 
jection of accounting numbers and principles 
into the industrial relations arena was premised 
on the assumption that it constituted a suffi- 
ciently robust technical and conceptual frame- 
work to convince labour of the rationality and 
reasonableness of managerial priorities and 
actions. 


ESTABLISHING THE BASES OF THE
PROFIT-SHARING SCHEME 


The managerial objectives and expectations 
underlying the introduction of the profit-sharing 
scheme become more apparent when one con- 
siders its particular, and in many ways highly un- 
usual, format. A provisional profit-sharing 
committee comprising management and 
employee representatives was established in 
June 1920 to explore for a six month period the 
basis for the scheme’s introduction. The scheme 
commenced in January 1921 and operated until 
the end of the decade. Although profit-sharing 
does not require a formal management–labour
body to monitor its performance, the committee 
was to continue in existence throughout the life 
of the scheme. It was explicitly designed by 
management as an arena for the discussion of in- 
dustrial relations issues. The formally recorded 
minutes of these monthly committee meetings 
provide important insights into the dynamics of 
management–labour relations. More significantly,
they offer a dynamic perspective on the
motives, roles and consequences of the injection 
of accounting numbers and principles into the 
field of industrial relations. 

At the first meeting of the provisional profit- 
sharing committee C. G. Renold sought to estab- 
lish the ground rules of the scheme and the roles 
of the committee. The representatives were in- 
formed that: 


“the Committee was not asked to decide anything but to 
simply thrash out proposals” (Committee Minutes 
[henceforth C.M.], 19.7.20).


Having specified that the Committee was to have 
no executive powers C. G. Renold argued that 
such a scheme could only operate by deciding: 


“What were the main interests concerned in industry... 
the general nature of the services rendered by each... and 
the payment of those services” (C.M., 19.7.20).


Since such questions strike at the very roots 
of organisational relationships various concep- 
tualisations of the issues were available. C. G. 
Renold’s answer was to present the scheme
illustrated in Fig. 1.

[Diagram not reproduced. Recoverable labels: outgoings, including raw
materials and wages and salaries; surplus to be divided between
(i) capital, (ii) staff and (iii) labour.]

Fig. 1. Organisational relationships and profit-sharing.


The choice of an accounting income state- 
ment, raising as it does problems in the selec- 
tion, definition and measurement of items to be 
included, might not appear the most obvious 
framework for analysis. However, from the start, 
accounting issues have been injected into the 
arena. Moreover, an unusual item was introduced and
categorised alongside the more conventional
expenses: the “wages of capital”.
C. G. Renold proceeded to elaborate on the 
scheme’s format: 


“the income of the business depended on the quantity of 
goods sold multiplied by the price. Price, therefore, was 
an essential factor. That was the first point to be noted in 
the scheme” (C.M., 19.7.20, emphasis in original). 


As regards the wages of capital: 


“which had been included as an expense was really nothing
more than a sufficient return to attract enough Capital...
If a business were to continue... it must be able to
pay the ‘Wages of Capital’ just as it must be able to pay the
Wages of Labour. That is to say, the ‘Wages of Capital’
were in the nature of an expense, not a surplus” (C.M.,
19.7.20, emphasis in original).


The selling price of the product, which paren- 
thetically one might observe was beyond the 
control of labour, was therefore central to the
scheme. In addition, the “wages of capital” were 
to be viewed, and indeed were to be accounted 
for, as analogous to the wages paid to labour, 
being expenses of the period rather than approp- 
riations of profit. C. G. Renold then clarified how 
the distributable profit-sharing surplus was to be 
computed: 


“The difficulty was this — although the ‘surplus’ was the 
difference between Income and Expenses (or Prices and 
Costs) and the Income was the result of sales, the income 
of a short period, such as a month, could not truly be set 
against the expenses of that month. ‘Production’ and
‘Sales’ could not, in the nature of things, be in step, and 
some way had to be found of measuring the activities of 
each section of the organisation by itself, namely, a price 
for its work or service... Since ‘selling price’ was the basis 
of the scheme these sectional prices must be divisions of 
the ‘selling price’; that is to say, the ‘selling price’ must be
divided up into parts attributable to the various sections 
of the business or groups of expenses, thus giving a 
standard cost for each” (C.M., 19.7.20, emphasis in
original).


The diagram used to describe the key relation- 
ships is illustrated in Fig. 2. 

Once again, accounting classifications were 
given a high profile in the scheme’s underlying 
structures, with Fig. 2 making it clear that there 
was to be a coupling of accounting controls and 
profit-sharing. This disaggregation of the Com- 
pany into “sections” and the introduction of 
standard costs ensured that the operational 
mechanics of the scheme were intertwined with 
the planning and appraisal of sub-unit performance.
However, the control system, by being based on the
allocation of the portion of the final selling price of
the product, introduced an element of uncertainty and
incorrigibility which was avoidable. The centrality of
this variable, which production departments could not
control and where the physically non-existent unit of
the wages of capital was to receive its entitled
portion, entangled the profit-sharing scheme in a
complex web of arbitrary accounting allocations.

Fig. 2. The calculation of a profit-sharing surplus
(Committee Minutes, 19.7.20, emphasis in original).

Despite the technical ambiguities of the 
scheme’s foundation it was described by C. G. 
Renold as “not guesswork but founded on facts 
and figures” (C.M., 4.8.20, emphasis added) 
whereby actual costs could be compared against 
standard and where “if the former were less than 
the latter a surplus would be manifest, a definite 
proportion of which would be divided” (C.M., 
4.8.20).

The scheme was to be based on the identifica- 
tion of a “minimum selling value” for the pro- 
duct. This revenue was to be allocated to the 
various organisation sub-units. The selling division’s
standard costs, although constituting
10% of sales revenue in the past, were to be
reduced to encourage the division to:


“recover the other half of their expenses out of the differ- 
ence between Minimum Selling Value and Average Sel- 
ling Value” (C.M., 4.8.20). 


The remaining 95% of the minimum selling 
value was to be termed the merchandise value 
which was to be allocated 12% to the wages of 
capital, 26% for raw materials, 32% for wages, 
13% for depreciation and 12% for administra- 
tion. These figures were to contain an allocation 
of factory overheads amounting to 150% of 
direct wages. Not only has every item of expenditure
been seemingly scrutinised as to its current
and desired level but the likelihood of a
bonus payment has also become dependent 
upon employee participation in the reduction of 
costs. 
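
The allocation described above can be restated as a short calculation. The following is only an illustrative sketch, not the Company's own procedure: the percentages are those quoted from the minutes, the 5% share for the selling division is inferred from "the remaining 95%", and the function names and the treatment of a negative difference as a deficit are assumptions made for the example.

```python
# Shares of the "minimum selling value" in whole percent, as quoted from the
# committee minutes. The selling division's 5% is inferred from the text's
# "remaining 95%"; it is not stated directly.
SELLING_DIVISION_PCT = 5
MERCHANDISE_SHARES_PCT = {
    "wages_of_capital": 12,
    "raw_materials": 26,
    "wages": 32,
    "depreciation": 13,
    "administration": 12,
}


def standard_costs(minimum_selling_value):
    """Split a minimum selling value into a standard cost per heading."""
    shares = {"selling_division": SELLING_DIVISION_PCT, **MERCHANDISE_SHARES_PCT}
    return {name: minimum_selling_value * pct / 100 for name, pct in shares.items()}


def surplus(standard, actual):
    """Total standard cost minus total actual cost.

    A positive result is a surplus, a negative one a deficit (assumption:
    no floor at zero, since the scheme did report deficits).
    """
    return sum(standard.values()) - sum(actual.values())
```

For a minimum selling value of 1000, the wages heading receives a standard of 320; if actual wages come in at 300 with every other heading on standard, the sketch reports a surplus of 20. It also makes visible that a fall in the selling price lowers every standard at once, whatever happens on the shop floor.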


The employee response 

Whilst it is apparent that the bases of the 
scheme had been determined in advance and the 
employee representatives might have considered
themselves, therefore, confronted by a “fait accompli”,
they did offer a number of observations. These hinged
about the magnitude of cash receipts labour might
receive from the operation of the scheme and in
particular focused on the relative payments to labour
and capital.


“A point which was raised was as to what was the Capital
which was to be entitled to the minimum return spoken 
of. Was it the present value of the buildings, machinery, 
plant, etc.? Mr. C. G. Renold said that this was a matter 
which could be more closely examined at a later stage” 
(C.M., 19.7.20).

“‘Establishment Charges’ were too high relative to ‘producing
wages’ ... the opinion expressed in the eyes of the
shop floor” (C.M., 19.7.20).


This line of questioning continued since it was 
minuted that: 


“Referring to the point as to why Capital should have any 
share of the surplus at all, Mr. C. G. Renold said that 
financial experts held the view — which could be well 
substantiated — that it was practically impossible to get
people to lend their money at a fixed return, except 
when there was a greater measure of security than there 
could be with ordinary shares. Therefore it was necessary 
to hold out the possibility of more” (C.M., 19.7.20).


The determination of the appropriate rate of re- 
turn for capital (its wages) became a key issue 
for the early committee meetings with C. G. 
Renold being compelled to explain what he con- 
sidered to be the required level: “this return 
must be on the real Capital of the business ...” 
where present balance sheet valuations did not 
represent “... anything like the valuation which 
would have to be paid, if — say — they were 
being sold to a new company” (C.M., 20.9.20, 
emphasis in original). It was finally proposed by 
management that the “wages of capital” should 
be 10% of a capital base of £569,000 of which 
£221,000 represented a revaluation of the as- 
sets. The “wages of capital” were subsequently 
fixed at an annual charge of £57,600 which rep- 
resented a prior claim on the monthly “profits” 
of the Company with any subsequent surplus 
being distributed on the basis of 70% to labour 
and 30% to capital. Given that the Company had 
made a trading loss in 1919 (Tripp, 1956, p. 119) 
this represented a considerable burden on the 
scheme’s potentiality to generate cash bonuses.

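The order of claims just described can be made concrete with a small worked example. This is a hedged sketch rather than a reconstruction of the Company's accounts: it takes the £57,600 annual charge and the 70/30 split as stated in the text, and assumes (hypothetically) that the charge fell in twelve equal monthly instalments and that a shortfall in one month was not carried forward.

```python
ANNUAL_WAGES_OF_CAPITAL = 57_600  # pounds, the annual charge stated in the text
PERIODS_PER_YEAR = 12             # assumption: equal monthly instalments
LABOUR_SHARE_PCT = 70
CAPITAL_SHARE_PCT = 30


def distribute(monthly_profit):
    """Apply the prior claim of capital, then split any residual 70/30.

    Returns the shortfall on the wages of capital (if any) and the bonus
    shares of labour and capital, all in pounds.
    """
    prior_claim = ANNUAL_WAGES_OF_CAPITAL / PERIODS_PER_YEAR
    residual = monthly_profit - prior_claim
    if residual <= 0:
        # Nothing is left to share once capital's "wages" are charged.
        return {"shortfall": prior_claim - monthly_profit, "labour": 0.0, "capital": 0.0}
    return {
        "shortfall": 0.0,
        "labour": residual * LABOUR_SHARE_PCT / 100,
        "capital": residual * CAPITAL_SHARE_PCT / 100,
    }
```

Under these assumptions a month's profit of £6,000 leaves £1,200 after the £4,800 prior claim, of which labour would receive £840 and capital £360; a month's profit of £4,000 yields no bonus at all and an £800 shortfall on the wages of capital.
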
Although the employee representatives on 
the committee had not, as yet, generated the
continuity to their arguments as was to charac- 
terise later meetings, they sought to explore the 
parameters of legitimate discussion within the 
committee. At the October meeting they intro- 
duced questions concerning whether tools used 
in production ought to be purchased from outside
or manufactured internally:


“... the question was asked as to what would be the standard
of the permanent Profit Sharing Committee in
criticising matters of this sort. Would its views be taken 
into account by the Directors? 

Mr. Jenkins stated that his personal view was that the Pro- 
fit Sharing Committee would not have any executive 
powers. At the same time it would be given freedom of 
discussion and criticism of anything appertaining to the 
success of the business, and he had no doubts that its 
views would have due weight with the Directors” (C.M.,
14.10.20, emphasis added). 


This employee probing of the scheme resulted 
in a series of discussions concerning specific 
technical details relating to the scheme. The 
managerial recommendation, for example, that 
the depreciation to be charged should be on the 
original cost of the assets rather than their net 
book values, resulting in a higher charge to the 
profit-sharing accounts, prompted the following 
employee comment: 


“it was not altogether fair from the employees standpoint, 
and suggested, as an alternative, that if the depreciation 
on the standard life of the machinery fell short (as it
must) of the amount required to replace it on an up-to- 
date basis, the employees should be given the opportun- 
ity of putting in loan-capital for that particular purpose, 
this loan capital to receive the same rate of wages and the
same rate of surplus as the capital which was the property 
of the present shareholders” (C.M., 21.10.20).


The labour interest in this issue, with future 
financial relationships not being considered as 
necessarily identical to the current ones, con- 
tinued: 


“Another member of the committee thought that money
required for the replacement of machinery over and
above the amount written into the Profit-Sharing accounts
should come out of capital’s share of the surplus”
(C.M., 21.10.20).


At the final meeting of the provisional Committee,
perhaps as a direct result of the managerial
surprise at the labour questioning of their 
policies, C. G. Renold clearly reiterated the role 
of the Committee: 


“the Committee would be consulted before a new policy 
was embarked upon but ... the Directors must safeguard 
the position that they could always go to the public for 
money if it were wanted — which would not be the case 
if the Directors had devolved their powers of control to 
a committee of employees” (C.M., 15.11.20).


This was dramatically illustrated when the rule 
underlying the allocation of the selling price to 
organisational sub-units which originally stated: 


“that particular allocations made against it shall be so 
made only with the approval of the Board of Directors 
and with the agreement of the Profit Sharing Executive
Committee.” 


was changed to:


“and with the previous knowledge of the Profit Sharing
Executive Committee” (C.M., 15.11.20, emphasis 
added). 


Prior to the formal introduction of the profit- 
sharing scheme in January 1921 and with the 
exploratory meetings of the committee having 
now come to an end, C. G. Renold addressed a 
mass meeting of employees to explain the 
scheme. 


“What can the worker expect to get out of the scheme? I 
answer — a) in the first place, a first hand knowledge of 
the business problems, and of the conduct of this busi- 
ness in particular ... 

b) The second thing the workers get out of the scheme is, 
of course, increased earnings when there is a surplus 
available ... 

In the past we have not (on the average) made large 
enough profits to pay out any bonus. Capital, in our case, 
has not had anything more than a market rate of return. 
The possibility of paying a bonus must, therefore, depend 
on our making savings in the cost of production over and 
above what has been done in the past ... At this point I 
must throw in a word of caution. The surplus, out of 
which the bonus is to be paid, is the difference between 
the expenses of running the business and the price we can 
get for our goods. The surplus can be made greater either 
by reducing our expenses or by increasing our prices. 
On the other hand, the surplus may be reduced or wiped 
out by our having to reduce our prices even though we
have also reduced our expenses by economy in produc- 
tion. It is unfortunate that this scheme is being launched 
just at a time when trade is bad and likely to become
worse. If we are to sell enough chains to keep the works 
busy, we shall have to reduce our prices, and this will 
make it difficult to produce a surplus until trade im- 
proves. Under these circumstances, what the workers 
can do is, by reducing costs, enable us to reduce our 
prices to such a level that the amount of employment in 
the Works may be kept up to the present level. If our present
prices have to remain in force, we shall have to reduce
our manufacturing programme and consequently
the number of workers. You can look at it this way: In 
good times there will be a surplus of which the workers 
will get the larger share. In bad times, if our scheme leads 
to economy of working, we shall be able to avoid un- 
employment or short-time by being able to lower our 
prices and so stimulate buying. I must say that some con- 
siderable reductions of price are now being arranged and 
will actually come into force before the scheme begins to 
operate” (C.M., 18.12.20, emphasis in original). 


The first part of the speech repeated the argu- 
ment used by C. G. Renold in previous commit- 
tee meetings that the scheme coupled the offer 
of cash payments with the opportunity for 
employees to become more familiar with “busi- 
ness problems”, emphasising the continuing 
theme of the profit-sharing exercise being tied 
into the notion of employee education. C. G. 
Renold also reiterated that the scheme was quite 
explicitly intertwined with cost control and 
with the dependence of cash bonuses upon sav- 
ings resulting from it. Furthermore, C. G. Renold 
again stressed the importance of the selling price 
of the product as a crucial determinant of the 
likelihood of bonus payments allowing him the 
opportunity to reinforce the idea of the Com- 
pany being subject to external pressures and 
constraints. However, there were also in the
speech modifications and elaborations of the 
thinking behind the scheme which had not been 
made explicit in the introductory discussions in 
the committee meetings. Firstly, there was an 
explicit managerial statement that although the 
scheme and the possibility of bonus payments 
were being linked to increased efficiency in production,
any such savings might be channelled
into price reductions rather than bonuses. There 
was, therefore, to be no explicit link between 
labour performance and cash rewards. Secondly, 
an additional element was introduced for consideration
by labour of the relative costs and
benefits from participation in the scheme. The 
scheme might help avoid labour redundancies 
otherwise necessary under the conditions of a
hostile competitive environment.


THE SCHEME COMMENCES 


The Hans Renold profit-sharing scheme was 
conceived by management as a means of bring- 
ing into an operational format the opportunity to 
confront a number of issues and problems facing 
the Company. The financial recruitment of 
labour into a joint dialogue where C. G. Renold 
would have the opportunity to pursue the 
“education” of employees into the realities of 
business and into the legitimation of managerial 
prerogatives was clearly consistent with his
own interests. The Company’s traditional con- 
cern with employee welfare and a continued 
commitment to accounting and scientific man- 
agement had created a behavioural and techni- 
cal infrastructure which was both ideologically 
and operationally congruent with the scheme’s 
emergence. Furthermore, its mechanics by plac- 
ing a clear emphasis on cost reduction and mar- 
ket forces coupled the scheme with the search 
for improvements in corporate performance. 

When the scheme was introduced formally in 
1921 the United Kingdom was on the verge of
one of the worst economic collapses ever, with 
output, incomes, employment and exports all 
falling (Aldcroft, 1970). Under such circum- 
stances it is tempting to forge an automatic link 
between environmental pressures and the Com- 
pany’s responses, with the profit-sharing scheme 
as the facilitative vehicle. From such a perspect- 
ive, the operational mechanics of the scheme 
and the priorities it sought to advance are per- 
ceived as reflecting some essential or 
indisputable purpose with management having 
little or no role either in their formulation or in 
the choice of methods for their resolution. This 
denial of the real element of discretion which 
management possess in such situations became a 
central theme in the committee’s discussions as 
the scheme became progressively intertwined
with other organisational activities and work- 
shop phenomena. 

Management had clearly sought the scheme to 
operate in a number of organisational arenas 
impinging upon corporate performance with accounting
control, marketing policy and the improvement
of management–labour relations
being central to such endeavours. It is critical, 
however, to appreciate the extent to which the 
scheme became entangled with other aspects of 
organisational life. In 1929, by which time the 
profit-sharing scheme had been operating for 9 
years, C. G. Renold gave a detailed description of 
the Company’s accounting control system, defining
it as “one which has been under development
for the past ten years (and) has been
evolved from within the organisation in order to 
meet the felt needs of management” (C. G. Re- 
nold, 1929b, p.1, emphasis added). The system 
contained many similarities to the data requirements
and accounts of the profit-sharing
scheme. Cost classifications, account headings, 
the basis of responsibility allocation, time period 
of feedback and the process of budgetary review 
were identical for both. More generally, C. G.
Renold in an overview of the Company’s pro- 
gress over a prolonged period of organisational 
change, gave some details of the success of man- 
agerial strategies to improve factory perform- 
ance: 


“The period of fifteen years between 1913 and 1928 has 
seen a revolution in our manufacturing processes, cleri- 
cal and statistical methods, and in our organisation. ... The 
1,807 people in 1928 ... produced in value nearly two-
and-a-half times as much per individual as the group of
1,256 people in 1913” (C. G. Renold, 1928, pp. 599-602, 
emphasis added).


Although it is impossible to be precise as to the
specific contribution the profit-sharing scheme 
made to this improvement in performance, it 
will become apparent from the committee meet- 
ings that the employee representatives coupled 
the two phenomena. They considered that, at 
least in part, the scheme had been used by man- 
agement as a means of securing changes in work 
method, greater cost control and increased man- 
agerial surveillance of the production process. 


Managerial adjustment to the scheme

Although C. G. Renold had initially established 
the scheme and the committee as a forum for co- 
operative discussions about the Company’s 
problems and the general promotion of an 
educative programme for labour, this was not to 
be easily achieved.

The financial results of the first month’s operation
of the scheme immediately generated controversy.
A deficit of £293 was reported for the
scheme. This was explained as follows: 


“we had arrived at a stage when prices were dropping 
considerably and would have to face further reductions 
in a few months time... a proportion of the anticipated re- 
duction would be charged into the accounts period by 
period” (C.M., 3.3.21).


Furthermore, it was proposed that a notice be 
posted in the workshop stating that: 


“There have been evidences of increased efficiency of 
working but the results have been absorbed by the unav- 
oidable effects of bad trade, which is outside the control 
of anyone” (C.M., 3.3.21).


However: 


“The notice was held up on the Chairman of the 
Employees’ Section of the Committee reporting to the 
Directors that the Committee had some misgivings about 
the accounts, and did not feel able to take on the respon- 
sibility of explaining them as they have been presented” 
(C.M., 3.3.21).


The labour dissatisfaction with “the absorption 
of savings made in the works by price reductions”
and the prospect that “extra efforts on the part of the
workers should only result in safeguarding the 
Wages of Capital” had therefore led to a situation 
whereby the very first financial accounts of the 
scheme had generated disagreement. Without 
the endorsement of the accounts by the 
employee representatives the scheme would fall 
into disrepute. 

The seriousness of this situation can be 
gauged by the fact that the very next day a meet- 
ing of the committee was held (whereas the con- 
vention was for monthly meetings) when, after 
some discussions among senior management: 


“C. G. Renold stated that as requested by the Executive
Committee the Board had reconsidered the question of 
making provision for falling stock values in the fiscal 
period accounts. The provision was made for ordinary 
business reasons. It had gone into the Company's ac- 
counts and nothing had happened since which indicated 
that the policy was anything but wise. The Board, there- 
fore, could not modify the policy because of the exis- 
tence of the profit-sharing scheme ... 

The scheme was fundamentally a profit-sharing scheme 
and not one of payment by results. The payments made 
under the scheme could not be in immediate and direct 
proportion to effort ... 

Mr. Dean did not seem satisfied. He held the view that the 
scheme was working wholly in the direction of stabiliz- 
ing capital and asked what guarantee the workers would 
have that, when the bottom of the market was reached as 
regards reduced prices, the management would not mod- 
ify the scheme to suit some new conditions that would 
have arisen” (C.M., 4.3.21).


The financial performance of the scheme was, 
however, to improve. The subsequent results 
over the next three months all reported 
surpluses with bonus payments of £1676, 
£1166 and £2155 being paid to employees. This 
was equivalent to an increase in the monthly 
wage bill of approximately 8%. This occurred 
at a time when the economic recession was 
devastating economic performance, with price 
reductions of between 10% and 33% being an- 
nounced on all the Company’s product lines, and 
when national negotiations were taking place for 
a reduction in the standard engineering wage 
rates. The managerial response was as follows: 


“Commenting on the general result C. G. Renold stated 
that by reasons of such a large part of production having 
gone into stock, it was probable that in the aggregate, on 
the Committee's basis of reckoning, there would be a loss 
at the end of the year, and that in spite of the precautions 
which were taken and which were the subject of so much
discussion earlier on, there would be an actual shortfall
on the Wages of Capital, which meant that payments had
been made which were not justified by real profits” 
(C.M., 23.5.21, emphasis added).


The long term managerial solution to the prob- 
lem of the completion of work-in-progress in- 
jecting a disproportional amount of added value 
to the scheme as it was transferred into finished 
stock was to change the basis of stock valuations, 
with work in progress and finished stock both to 
be valued on the same basis. Furthermore, an additional
provision of £1250 per month for falling
stock values was to be incorporated into the 
scheme’s accounts (C.M., 19.9.21). 

In the shorter term, however, the salience of 
these bonus payments at a time when engineer- 
ing wages were falling was noted. C. G. Renold 
argued that: 


“if new district rates were arrived at as a result of the 
negotiations now in progress in the engineering trade, 
modifications would come into the scheme and would 
necessitate an alteration in the scales which governed the 
periodic distribution under the scheme” (C.M., 20.6.21). 


This, however, was not uncritically accepted by 
an employee representative who retorted: 


“that corresponding with a reduction of worker’s wages 
there should be a reduction in the Wages of Capital. He 
considered that the shareholders should bear some por- 
tion of the burden of reducing costs. 

Mr. C. G. Renold said that the shareholders would be cal- 
led upon to bear any reduction in prices beyond what 
would be justified by the reduction in cost. He did not 
think it fair to put the wages of capital and the wages of 
labour on a par; the latter were practically guaranteed” 
(C.M., 20.6.21, emphasis added). 


From the earliest meetings management had 
argued that the Company was a coalition of in- 
terests and that there should be an equating of 
the wages paid to capital and labour. The consis- 
tency of arguments used in committee meetings 
was becoming critical. 

Reductions in wage rates were, however, im- 
plemented. Over a two month period the total 
expenditure of the company fell by 37% (C.M., 
19.9.21) with 272 workers being “paid off” in 
December (C.M., 16.1.22) representing 25% of 
the workforce. Moreover, increased productive 
efficiency had been achieved. After labour 
promptings as to the contribution of the scheme 
towards this, a management representative con- 
ceded that “he had no doubt whatsoever that it 
had considerable effect” (C.M., 4.10.21). 


The creation of multiple accounts 
With the completion of the first year’s opera- 
tion of the scheme, a new body of employee rep- 
resentatives was elected to the committee. At 
the first meeting some indication was given of a 
change in managerial policy as regards the 
scheme: 


“Mr. C. G. Renold said that this being the first meeting 
with the newly elected Committee, which contained sev- 
eral new members it would be well to review the objects 
in view. Broadly these were: 
1. To communicate information respecting business, its 
financial position, general prospects etc. 
2. To present the results of the Profit Sharing accounts as 
applied to the Scheme. 
It was felt that the most constructive part was the 
former and he would like to get down to a regular routine 
for supplying what information was considered best, it 
being the desire of the Board to give whatever was 
needed to enable an intelligent view of the situation to be 
obtained” (C.M., 27.3.22, emphasis added). 


What had ostensibly been set up as a committee 
to monitor a profit-sharing scheme was im- 
mediately confronted with a managerial state- 
ment that this aspect of its work was to become 
secondary. Instead the emphasis was to be 
placed upon the more general theme of informa- 
tion communication about the Company’s 
“financial position, general prospects, etc.”. This 
could well have been simply a pragmatic man- 
agerial response to their early experiences 
within the Committee and the environmental 
conditions which now confronted them. The 
employee representative had at no stage uncriti- 
cally accepted the financial accounts of the 
scheme’s performance as being totally reliable 
and complete. As such, the opportunity to exp- 
lain more aggregated levels of financial data deal- 
ing with overall Company performance rather 
than with the profit-sharing scheme might have 
appeared attractive. Furthermore, as a result of 
the more stringent market pressures which were 
now exerting considerable influence on the 
Company’s financial performance, management 
might now perhaps have wished to divert atten- 
tion away from the reduced possibility of bonus 
payments towards the need for the Company to 
survive this period of economic depression. 
This, however, raises the question as to how 
precisely management expected to divorce the 
deliberations about general Company perform- 
ance from the specific details of the scheme’s 
performance, since they had sought, via the 
mechanics of the scheme, to inextricably link 
the two. There is, therefore, some immediate 
tension in the situation whereby management 
had consistently attempted to pursue the object- 
ive of employee education by a strategy of defin- 
ing and articulating Company problems and 
priorities within an accounting framework and 
wished to continue to do so, but where the 
means of facilitating such an exercise — the pro- 
fit-sharing scheme’s accounts — had already 
generated some controversy and dissatisfaction. 
Some indication of C. G. Renold’s strategy for 
the pursuit of employee education without con- 
troversy immediately became apparent. 


“A newssheet had been issued to the Committee for the 
first time — The Summary Trading Account. 

This contained three accounts: 

1. Sales Measure Account. 

2. Merchandise Account. 

3. Company Account. 

The Company Account shows the results of the Com- 
pany’s activities as it would be presented to the outside 
world supposing it were a public company. It is different 
from the Sales Measure Account only in that the stock is 
valued at a lower figure (prime cost) ... this account is of 
little use for judging the week by week activity because, 
at a time when the bulk of the production was being 
made for stock (which is valued at less than half its 
current selling value) the credit for work done would be 
actually less than what it had cost. For this purpose an ac- 
count is required which takes the production into ac- 
count, whether sold or not, at the same level. The Sales 
Measure Account does this. The Merchandise Account... 
is computed on exactly the same basis as the Sales Mea- 
sure Account but includes the Wages of Capital provision 
as an expense. This is where planned and actual costs are 
computed” (C.M., 27.3.22, emphasis added). 


A multiplicity of accounts with seemingly dif- 
ferent managerial objectives had been created. 
Whereas the Company accounts were described 
broadly in line with what one might consider as 
conventional accounting practice, the sales mea- 
sure and merchandise accounts served to em- 
phasise not the reporting of aggregate financial 
performance to outside agencies but the dis- 
aggregation and appraisal of performance by the 
periodic control accounts. Furthermore, the 
inclusion of the wages of capital in the merchan- 
dise account facilitated the calculation of dis- 
tributable bonus payments under the terms of 
the scheme. As such, three quite specific ac- 
counts had been created to assess corporate, 
control system and profit-sharing performance. 
Whilst the performances of these three areas 
were undoubtedly interrelated and indeed the 
three accounts shared essentially the same 
underlying accounting data base, any level of 
technical ambiguity and discretion would per- 
mit different perspectives and resolutions of 
problems as the situation demanded. Moreover, 
the opportunity also existed for management to 
couple and decouple the relationships between 
the accounts, with, as will be shown, manage- 
ment at one time emphasising the relationship 
between the Company account and the mer- 
chandise account as an explanation of Company 
losses and the lack of bonuses, whilst at other 
times detaching bonus payment from corporate 
performance, preferring instead to explain the 
lack of bonuses within the constraints estab- 
lished by the control system. 
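
The way the three accounts in the Summary Trading Account diverge from a common data base can be sketched with a small calculation. The Python below is only an illustration: every figure is hypothetical, the names are invented for the example, and it assumes that the Sales Measure Account credits production for stock at current selling value while the Company Account credits it at prime cost (here 45% of selling value, in line with the minute's "less than half").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodFigures:
    """Hypothetical figures for one accounting period, in pounds."""
    sales: int                   # goods actually sold in the period
    costs: int                   # total costs incurred in the period
    stock_at_selling_value: int  # output made for stock, at current selling value
    prime_cost_percent: int      # prime cost as a percentage of selling value
    wages_of_capital: int        # provision for the return to capital


def company_account(p: PeriodFigures) -> int:
    # Stock is credited at the lower prime-cost valuation.
    stock_at_prime_cost = p.stock_at_selling_value * p.prime_cost_percent // 100
    return p.sales + stock_at_prime_cost - p.costs


def sales_measure_account(p: PeriodFigures) -> int:
    # Production is credited at the same level whether sold or not.
    return p.sales + p.stock_at_selling_value - p.costs


def merchandise_account(p: PeriodFigures) -> int:
    # Same basis as the Sales Measure Account, with Wages of Capital as an expense.
    return sales_measure_account(p) - p.wages_of_capital


# Illustrative period in which most output goes into stock.
example = PeriodFigures(sales=4000, costs=8000, stock_at_selling_value=6000,
                        prime_cost_percent=45, wages_of_capital=1200)
print(company_account(example), sales_measure_account(example),
      merchandise_account(example))
```

With these invented figures the same period shows a loss on the Company Account and a surplus on the other two, which is the kind of divergence that allowed the accounts to be coupled or decoupled as the occasion demanded.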


A GROWING CRITICAL APPRECIATION BY 
LABOUR OF THE PROFIT-SHARING SCHEME 


Accounting, scientific management and the 
basis of the scheme 

From May 1921 to January 1923 the scheme 
was unable to generate bonus payments. 
Throughout this period the micro-foundations 
of the scheme were subjected to intense labour 


scrutiny despite C. G. Renold’s wish that more 
general business concerns should be considered 
central. In July 1922, C. G. Renold took the op- 
portunity to present a comprehensive review of 
the accounting assumptions underlying the 
scheme. This incorporated a breakdown of the 
cost scale allowances and was supplemented by 
an explanation as to the sources of information 
used in the calculations. Extracts from this analy- 
sis are shown in Fig. 3. 

There are a significant number of points 
which the analysis in Fig. 3 makes very clear. 
Firstly, the complexity of the scheme was 
perhaps its most striking feature. The managerial 
desire to ensure that each and every unit of the 
business was set targets and had its performance 
measured and monitored introduced an element 
of substantial computational intricacy. A facade of 
precision is built up portraying not only the pro- 
duction process but the entire business as a care- 
fully and scientifically structured edifice which 
transcends dispute and question. No aspect of 
the business is spared the stop watch, the calcu- 
lation or the blueprint except the strategic 
power to initiate such an exercise. Yet within 
the detailed computations there exist inner ten- 
sions: on the one hand there is reference to “time 
study” and “drawing office calculations” yet at 
other times reference is made to “allocations 
being approximate only”, to items not adding up 
to 100 “except by accident” and to recognitions 
that “it is very unlikely this figure will be at- 
tained”. A curious mixture of subjective alloca- 
tions being computed to two decimal places 
when the allocation itself is explicitly based 
upon some surrogate causal variable. This is an 
attempt at management by facts, by calculation 
and by logic; an attempt to convince labour of 
the legitimacy of managerial prerogatives on the 
basis of ability, knowledge and expertise whilst 
simultaneously using the data available as a 
means of raising the visibility and hence control- 
lability of each employee. 

Secondly, the focus of the examination is on 
the micro-foundations of factory performance 
and their impact upon the scheme. The relation- 
ships between sales, output levels, cost targets 
and cost incurrence are summoned by manage- 
ment as explanatory variables for the scheme’s 
mechanics, with the emphasis firmly placed 
upon cost classification, analysis and control as 
the determinants of profit-sharing bonuses. No 
element of managerial discretion is admitted: 
there was instead a network of causal and cal- 
culative “truths” underpinning the integrity of 
the scheme and rendering management as the 
impartial upholders of a divine reasoning. 

Although this presentation of data did not 
generate any labour response at that stage, its 
subsequent revision 6 months later generated 
considerable controversy. This change in the 
scale took place at the same time as an increased 
level of general austerity in the factory was being 
introduced, with further selling price reductions 
of between 5% and 15% being implemented to 
stimulate sales (C.M., 6.11.22) and further wage 
reductions of between 8% and 25% being im- 
posed on all employees within the factory (C.M., 
4.12.22). 


Cycle Chain Scale of Expense 

                                  1922 prices         1923 prices 
                               Pence per  Approx   Pence per  Approx 
                                 foot       %        foot       % 
1. Raw Materials                 2.4      18.24      2.4      22.01 
2. Wages                         2.0      15.19      2.0      18.81 
3. Depreciation & Supplies       0.8       6.07      0.8       7.52 
   (40% on wages) 
4. Semi Production Deficit       1.0       7.59      1.0       9.40 
   (50% on wages) 
5. Administration                1.0       7.59      1.0       9.40 
   (50% on wages) 
FACTORY COST ALLOWANCE           7.20     54.60      7.2      67.73 
6. Wages of Capital              2.32     17.70      1.80     16.90 
7. Uncontrollable Expense        0.33      2.50      0.30      2.7 
8. Variable Factor               2.65     20.01      0.80      7.52 
MERCHANDISE VALUE               12.50     95.00     10.10     95.10 
9. Selling                       0.66      5.00      0.53      5.00 
MINIMUM SELLING VALUE           13.20    100.00     10.63    100.00 

1. Raw Materials - definitely calculable and consists of Drawing Office 
   calculations (which includes ordinary scrap) + 10% of extra allowance. 
2. Wages - based on production of 20,000 feet a week - from time study 
   investigation of May 1922. 
3. Depreciation & Supplies 
   40% of wages cost:               13% Depreciation, 27% Supplies 
   Actual Performance in Period 13: 14% Depreciation, 42% Supplies 
4. Semi Production Deficit 
   50% of wage cost 
5. Administration 
   “An attempt is made to allocate the whole cost of administration among the 
   different production groups more or less in proportion to the time that is 
   actually spent on them. Naturally these allocations can be approximate only.” 
6. Wages of Capital 
7. Uncontrollable Expense 
8. Selling 
   “These items are obtained by DEDUCTION from the list price, subject to 
   certain adjustments as follows: 
                                              Pence per foot 
   List price at 1922 price                        22.00 
   Deduct Maximum discount of 40%                   8.80 
   Minimum Selling Value                           13.20 
   Deduct Selling Allowance (5% of MSV)             0.66 
   Merchandise Value                               12.5[?] 
   Deduct Wages of Capital*           2.32 
          Uncontrollable Expense      0.33 
                                                    2.65 
                                                [illegible] 
9. Variable Factor 
   “The full scale being compiled partly by deduction (from Selling Prices) and 
   partly from different cost estimates, it follows that the items will not add up 
   to 100 except by accident and the difference between the two is designated 
   the variable factor.” 

Fig. 3. Cycle chain scale of expense. 
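
The deduction set out in the notes to Fig. 3 can be retraced as a short calculation. The Python sketch below uses the 1922 figures that are legible in the figure (list price 22.00 pence per foot, maximum discount 40%, selling allowance 5%, Wages of Capital 2.32, uncontrollable expense 0.33, factory cost allowance 7.20) and treats the variable factor as the residual between the deduced allowance and the estimated factory cost. The function and parameter names are illustrative assumptions, not terms from the minutes, and no rounding is applied, so the results differ by a few hundredths of a penny from the merchandise value of 12.50 and variable factor of 2.65 printed in the scale.

```python
from decimal import Decimal


def variable_factor(list_price, max_discount, selling_rate,
                    wages_of_capital, uncontrollable, factory_cost):
    """Return (minimum selling value, merchandise value, variable factor).

    All money amounts are in pence per foot; rates are fractions.
    """
    # Step 1: deduct the maximum discount from the list price.
    msv = list_price * (1 - max_discount)
    # Step 2: deduct the selling allowance, a fixed share of the MSV.
    merchandise_value = msv - msv * selling_rate
    # Step 3: deduct the items that are fixed by policy rather than by cost study.
    after_deductions = merchandise_value - wages_of_capital - uncontrollable
    # Step 4: whatever the cost estimates do not absorb is the residual.
    return msv, merchandise_value, after_deductions - factory_cost


msv, mv, vf = variable_factor(
    list_price=Decimal("22.00"),
    max_discount=Decimal("0.40"),
    selling_rate=Decimal("0.05"),
    wages_of_capital=Decimal("2.32"),
    uncontrollable=Decimal("0.33"),
    factory_cost=Decimal("7.20"),
)
print(msv, mv, vf)
```

Because the top of the scale is deduced from selling prices and the bottom is built up from cost estimates, the residual moves whenever either side is revised, which is why the item could shrink sharply between the 1922 and 1923 scales without any change in the production task.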


In addition, no bonus payments had been paid 
for 18 months. The labour response to this ac- 
cumulation of events was to refer back to an item 
which had been introduced in the detailed 
analysis of costs six months earlier (Fig. 3). This 
related to the term “variable factor”. The follow- 
ing interchange now took place at this meeting: 
Labour Representative (Mr. Parrish) 

“This was taken to be the measure of profit which the 
new programme could afford, part of which, it was as- 
sumed, would be available for periodic distribution, and 
it was disturbing to find this item was included in the 
non-distributable group. 

Management Representative 

“Mr. Jenkins argued that the first principle of the Scheme 
as regards the periodic accounts was to set a reasonable 
task and the allowances were never intended to be any- 
thing else than a practical estimate of the cost of produc- 
tion. The function of the Scheme was to distribute any 
surplus that may arise by the cost being reduced through 
improved efficiency. The ‘variable factor’ was merely 
the difference between what an article could be pro- 
duced for and the price which would be obtained for it. 
It had no relation to the setting of a production task and 
was, therefore, left out of the account until the end of the 
year when, if there was a real surplus, what there was 
would be divided in the ratio set by the Scheme. 

Labour Representative 

“As regards this, the members of the Committee stated 
that the real significance of that item, i.e. variable factor, 
had not been realised before and its presence appeared to 
indicate a departure from the principles of the Scheme as 
understood in the beginning. 

Management Representative 

“Mr. Jenkins said that the Scheme did not purport to 
distribute periodically what might be called the ‘com- 
mercial’ profit but only surpluses created by the effort 
beating the task” (C.M., 8.1.23, emphasis added). 


Having already observed a level of technical 
ambiguity permeating the scheme, with a multi- 
plicity of accounting measures and statements 
being created to widen its range of responses to 
various organisation demands and managerial 
objectives, it appears now that even the defini- 
tion of the scheme itself and the principles upon 
which it was based were open to different in- 
terpretations at different times. The managerial 
statement is an explicit recognition that the de- 
scription of the scheme as profit-sharing is con- 
ceptually misleading. In this context Mr Jenkins 
is keen to emphasise its similarities with a pay- 
ment by results system whereby no claims to the 
distribution of “commercial profit” exist. From 
such a perspective, the employee’s interest in 
the “variable factor”, the 7.5% of the standard 
scale which was originally defined as simply the 
“accidental” product of arithmetical averaging, 
but which later came to be “merely the differ- 
ence between what an article could be pro- 
duced for and the price which could be obtained 
for it”, is deemed irrelevant. Yet if one contrasts 
this with the committee meeting of March 4 
1921 when the labour interest was in the re- 
ported efficiencies in the factory which had 
“been absorbed by the unavoidable effects of 
bad trade”, the labour concern with the lack of 
corresponding financial reward was ruled as in- 
admissible on the basis that “the scheme was fun- 
damentally a profit-sharing scheme and not one 
of payment by results. The payments made 
under the scheme could not be immediate and 
direct proportion to effort”. As such, different 
definitions of what constituted the scheme were 
used by management to respond to different 
lines of enquiry by the employee representa- 
tives. 

This ambiguity of definition is of considerable 
significance. An emphasis on the profit-sharing 
aspect of the scheme could be summoned as a 
justification for the Company as a coalition of in- 
terests, for the rights of capital to its wages, and 
for the inclusion of selling prices and sales quan- 
tity as key variables in the scheme, influencing as 
they did corporate profits and corporate survi- 
val. Under a payment by results scheme, how- 
ever, with its underlying theme of scientific 
management, clearly defined parameters of 
power and responsibility are established with 
employees being evaluated against precisely 
specified standards. Here the uncontrollable var- 
iable is excluded from performance measure- 
ment and the financial rewards paid to labour are 
in a direct and unequivocal relationship to effort. 
Such criteria had never been comprehensively 
applied to the scheme, although management 
had specified previously that the standards were 
not based on “guess-work” but were “founded 
on facts and figures” (C.M., 4.8.20). Now the 
standards “were never intended to be anything 
else than a practical estimate of the cost of pro- 
duction” with even the term “variable factor” 
standing in stark contrast to the earlier terminol- 
ogy of science and precision. In other words, this 
entangling of profit-sharing, payment by results 
and the different assumptions upon which each 
was based, provided management with an array 
of assorted principles, priorities and explana- 
tions for the scheme which acquired their 
rationale not from any inherent conceptual logic 
but from their usefulness to management as 
selected responses to particular situations. 


Scrutinising the accounts 


In April 1923 214 million parts involved in the 
manufacture of motor cycle chains were scrap- 
ped. In response to questions asked by 
employees, management made the following 
statement: 


“Dealing with the point raised by the Committee, that fai- 
lure to charge this loss wholly within the period in which 
it would be incurred would be a breach of the principle 
of ‘each period’s account standing on its own’. Mr. H. G. 
Jenkins stated that it had never been the practice, or in- 
tention, to charge widely fluctuating expense items 
within the period in which the money was actually spent 
or in which the debit, as in the present case, became 
known” (C.M., 23.4.23). 


The employee representatives were 
by this stage acquiring an increasingly sophisti- 
cated grasp of the mechanics of the scheme and 
of the impact of various accounting methods on 
bonus payments. Their argument was that the 
scheme’s well established principle of “each 
period’s account standing on its own” demanded 
that the exceptional losses incurred in that 
period from the wastage of the materials should 
be written off in that period and not be allowed 
to distort the relevant accounts of subsequent 
periods for performance evaluation purposes. 
The managerial version of events takes a diffe- 
rent perspective on the problem, and can be 
viewed as an equally legitimate interpretation of 
the effects of the problem upon the Company’s 
performance. Their view was that since the 
losses were incurred over several periods and af- 
fected several periods’ performance then this 
should be reflected in the accounts. Perhaps the 
interesting issue, therefore, is not so much the 
relative merits of the accounting principles 
involved per se, or the definition of “good ac- 
counting practice” within the parameters of the 
technical itself, but how management and labour 
sought to introduce meanings and descriptions 
of organisational events which when translated 
into accounting practice would promote their 
respective views of “reality”. 

The ability of the scheme’s accounts to pro- 
vide a description of factory events which con- 
formed to the labour perception, and indeed 
managerial anticipation, of performance was re- 
turned to at the same meeting. The financial 
results of the scheme for the period were pre- 
sented and it was announced that in the camshaft 
department a surplus of £511 in the previous 
period had now become a deficit of £419 in the 
current one. This was explained as follows: 


“As regard the Camshaft Department there were partial 
explanations of the changeover; for instance £480 was 
included in the charge for ‘material’ for bushes bought 
from outside and the nightshift costs for bush bending 
was excessive having regard to the production due to the 
employment of youths instead of girls, but these were not 
sufficient to account for such a considerable difference. 
The Directors were not convinced that the account rep- 
resented any real change in the situation and was rather 
a matter of ‘adjustments’. 

Mr. Jones said that he and the Superintendent of the de- 
partment were extremely surprised at the result because 
so far as they could judge everything was going swim- 
mingly. He said that the Superintendent had just received 
his account and was under the impression that there had 
been a change in the formula” (C.M., 23.4.23). 


At the next meeting it was reported: 


“The only question asked in the meeting was in regard to 
the Camshaft Chain figures on the trading statement, in 
which it appeared there were some errors” (C.M., 
16.5.23, emphasis added). 


It was not, however, only the descriptive 
ability of the scheme’s accounts which came 
under intense labour scrutiny. Closely related to 
this was the recognition by the employee rep- 
resentatives that the financial and operational 
strategies determined by management had a sig- 
nificant effect on the scheme and that the 
scheme’s accounts provided a suitable means for 
their interrogation. 

A point was taken up by the committee which 
had first been raised 6 months earlier when man- 
agement had announced that approximately 
25% of sales were now exports and that these 
were to be priced at a special rate of 10% below 
the minimum domestic price (C.M., 26.3.23). As 
more information became available the 
employee representatives began to consider the 
effect of such a marketing strategy upon the 
scheme: 


“Mr. Parrish asked a question in regard to foreign sales. It 
had been stated by the Finance Director that Foreign 
Sales in Period 3 other than through the subsidiary com- 
panies were made at a margin above the minimum selling 
value of 6 or 7% whereas those made to subsidiary com- 
panies were at 11% below minimum selling value. Mr. 
Parrish asked the reason for the wide difference between 
the prices at which we sold for export to subsidiary com- 
panies and to others” (C.M., 8.9.23). 


The labour worries that substantial sales dis- 
counts reduced the likelihood of bonus pay- 
ments and that the location of profit recognition 
seemed to offer management the potential to ar- 
bitrarily influence the profits of the parent com- 
pany were not immediately resolved. Later, 
however, it was agreed that a proportion of the 
profits of subsidiary companies should be 
brought into the pool of distributable monies. 


Exposing the alleged scientific basis of 
managerial decision-making 

It was not, however, only the profit-sharing ac- 
counts which were exposed to labour scrutiny. 
Progressively, labour attention began to focus 
on the very basis of managerial decision-making 
itself. The impetus for this challenge arose when 
management reported that: 


“Board Minute No 305.4 Decision 2C (16.1.24) affirms 
the policy of getting work done for the Plant Group by 
outside firms where possible ... 

The policy also tends to much greater expedition, and 
therefore probably lower cost, because the company 
could not put as many men on to any job as an outside 
contractor at any one time ... and it is hoped that a grea- 
ter efficiency will result from the adoption of this policy 
which calls for a smaller permanent staff sufficient only 
for the general maintenance” (C.M., 24.3.24, emphasis 
added). 
However: 


“Mr Moss of the Shop Stewards had discussed the last 
statement on contract work and found it unsatisfactory ... 
There were several points in the statement to which ex- 
ception might be taken but for the present they proposed 
to confine themselves to judging the position from the 
economic standpoint. They were in a position to see 
what was going on and it was their belief that much of 
the work that was done by outside concerns could not 
possibly be done as well nor as cheaply as by the inside 
staff. No proof had been supplied that this is not so” 
(C.M., 22.4.24, emphasis added). 


In response to this statement C. G. Renold reiter- 
ated his views on the benefits of outside contrac- 
tors but again without providing any numerical 
“factual” analysis. As such at the next meeting: 


“C. G. Renold said that it had not been possible to get the 
particulars of cost comparisons to show the effects of 
this policy, chiefly for the reason referred to at the last 
meeting ... so it is not possible to obtain the true cost of 
a particular job.” 


Consequently: 


“Mr Dean said that the attitude of the Shop Stewards 
Committee was that Hans Renold Ltd employees were 
working short time and losing money while the 
employees of outside concerns were getting a full week 
on work which our own employees could well handle” 
(C.M., 19.5.24, emphasis added). 


The salience of this managerial inability to con- 
vince the employee representatives of their 
policy without the prop provided by quantita- 
tive and financial information gradually became 
coupled to a more general employee unease 
with the performance of the scheme and in par- 
ticular the lack of bonus payments. 


“Mr Dean speaking on behalf of the shop workers gener- 
ally, said they had become a little anxious concerning 
the success of the scheme. From the production point of 
view it had been an unqualified success, that is to say, 
it gave the management the opportunity of studying op- 
erations and setting schedules which would not be pos- 
sible other than on a piece rate basis. By these means 
manufacturing costs had been reduced to a minimum and 
the general efficiency had reached a very high standard. 
It was logical in view of these factors that the workers 
should receive something more than promises” (C.M., 
3.11.24, emphasis added). 


C. G. Renold responded that: 


“The Directors must safeguard such a return on the real 
capital employed as would attract outside money if it 
were needed. This was only ordinary commercial pru- 
dence” (C.M., 3.11.24). 
However, the persistence of labour pleas for a 
thorough reappraisal of the scheme’s ability to 
generate cash payments finally prompted the fol- 
lowing managerial statement: 


“Mr C. G. Renold stated that the Directors had not been 
able to get at this question and saw no prospect of being 
able to give it the necessary study for a good many 
months to come. It was therefore proposed to suspend 
the question ‘sine die’. The Directors would be unable to 
consider any change ... in the basis of the scheme for an 
indefinite period. ... The Directors desired to appeal to 
the Committee, and through it to the whole place, to as- 
sist in warding off questions or investigations” (C.M., 
5.1.25, emphasis added). 


This plea was, however, unsuccessful as a gen- 
eral labour unease with the scheme became in- 
creasingly obvious. Accusations of “exploitation 
and rate-cutting” were mentioned, as was the view that “the 
accounts for the purposes of the scheme should 
be the same as those used for the management of 
the company” (C.M., 26.1.25). At the same meet- 
ing C. G. Renold replied: 


“that the Committee's action in bringing forward this 
question immediately following his call for a period of 
tranquility had forced the Directors to reconsider their 
attitude to the Scheme but before coming to a decision 
they would like to have the views of the Committee. He 
said that the Board’s attitude was somewhat as follows. 
The existence of the Scheme laid, without doubt, a heavy 
burden on the management ... 

He went on to say that so long as the Scheme really did 
something to improve the harmony and unity of effort 
the burden of administrating it was well worthwhile, but 
the Directors were beginning to feel that the Scheme was 
discredited in the eyes of the workers and was therefore 
producing doubt and contention rather than harmony” 
(C.M., 26.1.25, emphasis added). 


C. G. Renold then went on to define more pre- 
cisely what he meant by the additional “burden” 
to management of the profit-sharing scheme: 


“The ‘strain’ which he had spoken of was much more far 
reaching than this and was a somewhat intangible one. He 
would try to illustrate it. In all discussions of policy the 
Directors had to consider two points of view, viz; the 
needs of the business having regard to outside influences 
— competition, etc. — and also how far the policy de- 
cided upon could be explained and justified to the Com- 
mittee. He said there were a great many things the Board 
would be able to accept and act on with less explicit evi- 
dence than was considered necessary to go to the Com- 
mittee. The Directors were dealing all day long with 
general questions of management and were able to act, 
to a certain extent, on general impressions which was 
quite a sound basis of action among persons who were 
in close touch with things and with one another, but 
these impressions could not be expected to satisfy a 
Committee like the Profit Sharing Committee which 
was comprised of individuals who were not in such 
close touch. The problems of guiding the business 
through the present very difficult trade situation were as 
much as the Directors could handle and to have, at the 
same time, to give their attention to other aspects which 
they would not have to do were it not for the need to exp- 
lain matters to the Committee did involve a very real 
extra strain” (C.M., 26.1.25, emphasis added). 


Management made the observation that given 
the economic and marketing problems the Com- 
pany faced, they could not reasonably be ex- 
pected to devote a disproportionate amount of 
time to the accumulation and analysis of data to 
satisfy the labour demands. However, on closer 
scrutiny the speech is of far greater significance 
than the mere statement of fact. Having traced 
the development of the scheme over a five-year
period, a central theme throughout has been the
managerial aim to expose the employee representatives
to the “realities” of business problems,
in particular, the difficulties management
faced and the data upon which policy was 
guided and by which they were constrained. Ac- 
counting cost data, selling prices, standard allo- 
wances, standard and actual profit margins are 
some examples of the quantitative, and “factual” 
data disclosed to employee representatives. In 
addition, the entire factory had been subjected 
to the stop-watch, the scrutinisation of perform- 
ance, and subsequent restructurings of activities 
and work methods. Management had, particu- 
larly at the outset of the scheme but also as a gen- 
eral policy, encouraged the employee represen- 
tatives to think of the Company’s problems, what 
the data showed and their position in this co- 
operative venture. Yet in this speech manage- 
ment were portrayed as often having to act on 
“general impressions”, as decision-makers with 
some intimate and local knowledge of problems 
which transcended a rational data based criter- 
ion for decision-making. These were perhaps 
reasonable observations and more genuine and
realistic descriptions of the basis of actual managerial
decision-making. However, this had substantially
different connotations as a basis for the
justification of managerial prerogatives, since
decisions taken on a personal perception or feel
for the situation could be legitimately questioned
on the basis of alternative perceptions or
alternative feels for the same situations.


RECONSTRUCTION AND TERMINATION OF 
THE SCHEME 


Although the committee was intended by 
management as an arena for co-operative discus- 
sions and employee education this had not been 
achieved. Contentious issues relating to the 
scheme had been raised and not resolved. Whilst 
labour dissatisfaction had focused on the ab- 
sence of bonus payments in return for improve- 
ments in performance, this had often been 
defined and articulated via the scheme’s ac- 
counting principles and numbers. Controversy 
over the wages of capital, provisions for falling 
stock values, the “variable factor” and deprecia- 
tion charges can all be seen in this context. 
Furthermore, as the employee representatives 
displayed increasing expertise in the interpreta- 
tion of the scheme’s accounts, their arguments 
had exhibited more sophisticated challenges to 
the descriptive ability of the accounting calculus 
and its reconciliation with their own percep- 
tions of factory activities. Indeed, as there had 
been built into the scheme’s accounts a level of 
technical ambiguity and variety designed to en- 
hance the range of feasible responses available to 
management, this had placed a premium on their 
capacity to present consistent interpretations of
performance. Management’s inability to do so 
had led them to attempt to reduce the salience of 
the scheme’s accounts in the committee’s delib- 
erations. Increasingly, however, the very 
rationale of the scheme had been exposed to the 
most clinical examination. Challenges to the 
scheme’s accounts no longer concentrated solely on
their limitations but also on the potentiality they 
offered management to manipulate the data for 
their own purposes. As such, the educative 
media management had sought to promote generated
controversy rather than agreement. Indeed,
as the employee representatives recognised
that management had very real levels of
discretion, as with the wages of capital, with 
transfer prices, and with the use of outside con- 
tractors, the new “factual” basis for managerial 
prerogatives dissolved as the accounting num- 
bers left labour unconvinced as to their objectiv- 
ity and accuracy. 

The managerial solution to these problems 
was to attempt to remove the sources of labour 
discontent from the committee’s agenda. In 
January 1925 periodic (monthly) profit-sharing 
calculations were dropped, with the scheme becoming
“One to distribute annual profits when
there are any” (C.M., 26.1.25). This was followed
in May 1925 by the recommendation that 
“the chart showing periodic profitability should 
be dropped from works publication” (C.M., 
25.5.25). By the commencement of 1926 no ac- 
counting measures were being presented to the 
committee. Whilst these steps had the immediate
impact of reducing controversy and disagreement
within the committee meetings,
other aspects of corporate performance were 
soon to impinge upon their deliberations. 

The financial year ending June 1925 had re- 
sulted in a trading loss for the Company of 
£7696 and no profit-sharing bonus payments. By 
June 1926 this had been converted into a profit 
of £41,000. As the Company emerged from the 
depths of economic recession to achieve over 
the next 3 years record performance levels, 
these more buoyant trading conditions were to 
provide a critical backdrop to the operation of 
the scheme and the committee’s discussions. 


Reformulating the profit-sharing concept
With more favourable trading conditions 
came the possibility that substantial profit- 
sharing bonuses to employees would become 
payable. C. G. Renold considered that a detailed 
re-evaluation of the scheme was necessary: 


“This year’s prospects are very much better and may even
lead to a substantial payment to the workers. There is
some element of luck in this year’s improvement, due to
the present disorganisation of our competitors. This
particular condition cannot be expected to last, and we
must look forward to a renewal of the keenest competition
in the future. The recent amalgamation of two of our
chief competitors may well make their competition more 
serious than ever before. To meet this the most ordinary 
business caution demands that we should conserve to 
the utmost all our financial resources and keep them 
available to meet whatever difficulties the future may 
hold. As is well known to this Committee, shortage of 
capital, as represented by the limit set by the Bank on our 
overdraft, has been one of the problems with which we 
have been struggling during recent years, and is still one 
of our most serious difficulties. If there is a profit this year 
which involves a distribution the cash required will have 
to come from the Bank — as part of our overdraft. This is 
undesirable as it means that Capital’s ability to carry out
its undertakings under the Scheme depends on the willingness
of the Bank to find the ready money. The whole
matter comes back to the necessity of building up, as part
of a sound business policy, a Reserve Fund to finance
the growth demanded by the Market” (C.M., 8.2.26, emphasis
added).


C. G. Renold decided that under such cir- 
cumstances the scheme should be terminated in 
July 1926. Of critical importance, however, is 
the clear indication that management sought to 
focus attention on the environmental factors 
which had dictated this state of affairs. External 
institutional forces constraining the scheme’s 
operation were summoned as explanatory vari- 
ables, with “the Bank” and “the Market” being 
singled out for particular mention. There was 
then the appeal to more pragmatic and common-sense
variables such as “luck”, “ordinary business
caution” and “sound business policy” as explana- 
tions for the scheme’s termination. It was, there- 
fore, not only the scheme which was to be 
changed but, as will become apparent, there was 
also to be a managerial attempt to orientate the 
committee’s attention and discussion towards 
the relationship between the Company and its 
economic environment. As the inconsistencies 
between the performance of the scheme and factory
events had provided a source of ammunition
for the employee representatives to interrogate
management on a wide set of Company activities,
such a decision was perhaps not surprising.
However, the managerial desire to change
the scheme when, as one employee representative
commented, “because at last the workers
looked like getting something worth having”
(C.M., 8.2.26), was to place a high premium on 
their ability to convince labour of the approp- 
riateness of the decision. 

Although the old profit-sharing scheme was 
terminated in July 1926 a new scheme was prop- 
osed by management to take immediate effect. 
The details of this scheme reflected managerial 
disappointments and concerns with what had
transpired. The contentious items of periodic
distributions, wages of capital, and the effect of
large corporate profits on bonus payments were
to disappear, with profit-sharing payment now
being based on dividends paid rather than profits
earned. Under the new scheme when the
dividend on ordinary shares reached 10%,
employees were to be entitled to 20% of this
gross distribution. Individual entitlement was to
be “pro-rata” on wages with the important constraint
that 20% of the bonus was to go to the
monthly salaried staff with the remainder to
weekly paid employees. The new scheme was
presented to the employee representatives by
means of the table shown in Fig. 4.


“Commenting on the table (Figure 4) ... Mr. C. G. Renold
emphasised that there was no definite connection between
profit made and the rate of dividend. It may be,
for example, that the Directors would declare a 10% dividend
when the profit made only £25,000 if such a year
followed a series of good years and the prospects for the
future were good, on the other hand, in reverse circumstances
they might not pay 10% even if the profits for
a given year were £40,000” (C.M., 16.9.26, emphasis
added).


As regards the new scheme it is apparent that
the term “profit-sharing” had by now become a
very loose description. A major feature of the revised
terms makes it explicit that no direct link
between Company profitability and bonus payments
existed. Under the old scheme, irrespective
of the accounting problems of trying to
compute some meaningful measure of profit for
so short a time period as a month or of the effect
on the likelihood of bonus payment of the wages
of capital, employees could cling to the belief
that, in theory at least, bonus payments were related
to Company profitability. Now it is made
explicit that the basis of bonus determination,
dividends, need not have any “definite connection”
with profitability. Furthermore, an element
of managerial discretion now explicitly
existed to determine what was considered to be
the appropriate bonus payment. Despite the presentation
in Fig. 4 of the relationship between
“probable” dividend payment and new bonus
entitlement these represented only hypothetical
situations. Dividend policy and hence both the
threshold level at which a bonus became payable
and its appropriate amount were clearly
under managerial control.


                   Ordinary dividends    Distributable funds
                                         Old                  New
Probable profit    Rate (%)   Amount     Amount    Days pay   Amount   Monthly staff   Others
20,000             7          12,529     -
25,000             9          16,109     -
30,000             10         17,898     -         -          3580     10[?]           5[?]
40,000             10         ditto      -         [?]        [?]      [?]             [?]
50,000             10         ditto      -         [?]        [?]      [?]             [?]
60,000             10         ditto      4200      5[?]       [?]      [?]             [?]
70,000             12½        22,373     11,200    15         4475     13[?]           7
80,000             12½        ditto      18,200    24[?]      [?]      [?]             [?]

[?] marks figures that are illegible in the source.

Fig. 4. Comparison of distributions.
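
The arithmetic of the new scheme can be checked against the dividend amounts shown in Fig. 4. What follows is a minimal sketch, assuming that the 20% employee share is applied to the gross ordinary dividend once the dividend rate reaches 10%, and that the monthly staff receive a fixed fraction of the resulting fund. The function name `new_scheme_bonus` and its parameter names are illustrative only and do not come from the Company's records.

```python
def new_scheme_bonus(dividend_rate, dividend_amount,
                     threshold_rate=10.0, employee_share=0.20,
                     monthly_staff_share=0.20):
    """Return (total, monthly_staff, weekly_paid) bonus in whole pounds.

    Hypothetical helper: the rule is taken from the description of the
    1926 scheme, the rounding to whole pounds is an assumption.
    """
    # No distribution unless the ordinary dividend reaches the threshold.
    if dividend_rate < threshold_rate:
        return (0, 0, 0)
    total = round(dividend_amount * employee_share)
    monthly_staff = round(total * monthly_staff_share)
    return (total, monthly_staff, total - monthly_staff)


# Dividend rates and amounts as listed in Fig. 4.
print(new_scheme_bonus(9, 16109))      # dividend below the 10% threshold
print(new_scheme_bonus(10, 17898))     # fund of 3580, as in Fig. 4
print(new_scheme_bonus(12.5, 22373))   # fund of 4475, as in Fig. 4
```

Under these assumptions the computed funds agree with the "New" amounts of £3580 and £4475 in the table. Because the fund depends only on the dividend declared, any profit level for which the Directors chose the same dividend would yield the same bonus, which is the managerial discretion the text describes.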


The labour response to the new scheme 

The initial labour response to the new scheme 
focused on the managerial decision to allocate 
20% of any gross distribution to the monthly 
paid staff. As one employee representative 
noted, “What were the manual workers likely to 
think of a scheme which gave special benefits to 
their superintendents?” (C.M., 23.9.26). A sub- 
sequent reduction of this proportion to 16% 
(C.M., 7.10.26) and a managerial insistence that 
this now constituted a non-negotiable condition 
of the scheme, convinced labour that “if they did 
not vote for this scheme they would be left with
nothing” (C.M., 24.1.27).

The financial performance of the Company 
was to improve continually. In August 1927 an 
annual Company profit of £30,072 was reported 
which on the basis of a 10% dividend to 
shareholders resulted in a profit-sharing bonus 
of £3580 (equivalent to 1 week’s wage for the 
hourly paid employees and 1.7 for the monthly 
staff). 

When the results of the next financial year 
were presented in August 1928 (C.M., 3.8.28) 
they justified the optimistic interim statements 
made by management that “sales were the high- 
est on record, apart from the munition years” 
(C.M., 3.5.28), and “the production during the 
period constituted a post-War record” (C.M., 
17.5.28). A Company profit of £61,137, after adjustments
and employee bonus appropriations, was reported,
which was double that of the
previous year. It was announced that since a
12½% dividend was to be paid, this would result
in a bonus to employees of £4576, which constituted
1.1 weeks’ pay to hourly paid workers
and 2 weeks’ wages for monthly staff. On the basis
of management’s own figures of September 1926
(Fig. 4), such results fell within the Company
profit parameters which, under
the terms of the old scheme, would have merited
more substantial bonus payments. If the
employee representatives felt a sense of injustice
at this state of affairs (and subsequent events
give clear evidence that they did) C. G. Renold
presented the data shown in Fig. 5 as a justification.

“The following table shows our achievement as regards
price reductions, together with the movement of wages
over the same period” (C.M., 5.10.28).


        PRICE INDEX OF     COST OF LIVING INDEX    HANS RENOLD WAGE INDEX
        RENOLD PRODUCTS    (BOARD OF TRADE)        MEN       WOMEN
1913    100                100                     100       100
1920    250                270
1927    110 [?]            169
1928    100 [?]            165                     185       200

[?] marks an uncertain reading in the source.

Fig. 5. Prices, cost of living and wage indices.


C. G. Renold’s argument that the Company 
had increased wages in excess of the cost of liv- 
ing index whilst maintaining its viability by the 
reductions in selling prices presented a dis- 
tinctly partial overview of Company perform- 
ance. The contributions of the employees to re- 
cord output levels, to the reduction in costs and 
to the operation of new work methods, in other 
words the actions which facilitated the Company’s
survival and now expansion, were not included
and evaluated alongside the presented
data. Once again, the focus of managerial attention
had been firmly located in the context of the
relationship between the Company and environmental
indices with micro-organisational considerations
being clearly set aside.

The twin central features of the new scheme, 
viz. the managerial discretion to determine di- 
vidend and hence bonus payments and the spe- 
cial financial treatment given to the monthly 
staff, did not go unnoticed by the employee representatives.
In May 1929 the committee was informed
that it was considered:


“wrong in principle that, in a year when the profit made 
may be 100% higher than in the previous year, the Fund 
may be increased by as little as 25% or, maybe, not at all,
particularly as the workers have been given no rights of
ownership in the profits withheld”


and 


“The Foremen’s and Shop Stewards section are of opinion 
that a pro-rata distribution of the fund is the only equitable
basis and would have it apply from the highest paid
to the lowest paid worker, that is to say including Directors
and Managers, etc. at one end and juniors at the
other, as all contribute in some measure to the profitability”
(C.M., 27.5.29, emphasis added).


After 9 years of attempts by management via the
scheme to construct a consensus on the legitimacy
of existing organisation structures, of
key Company financial relationships and of the
basis of managerial prerogatives such a statement
might have been perceived by them as disappointing.

In November 1930 the scheme was termi- 
nated when the Hans Renold Company merged 
with the Coventry Chain Company. 


CONCLUSIONS 


The Hans Renold profit-sharing scheme was 
conceived and designed by management as a 
means of strategically injecting accounting into 
the conduct of its industrial relations. In review- 
ing the emergence, roles and consequences of 
the scheme a number of key issues become appa- 
rent. 

Firstly, systems of accounting do not emerge 
in an organisational and environmental vacuum. 
Furthermore, rarely do they emerge by chance. 
Personal objectives and expectations, pre-exist- 
ing organisational structures and processes and 
environmental circumstances can all combine 
to provide a complex configuration of influ- 
ences impinging upon the derivation of particu- 
lar accounting systems. Indeed, this contextual 
framework will often prove critical in determin- 
ing the specific form of accounting to emerge. As 
such, attempts to understand the multiplicity 
and diversity of actual accounting practice must 
be sensitive to the local and idiosyncratic con- 
texts in which they are embedded. 

In the Hans Renold case, the introduction of a
profit-sharing scheme in 1921 had roots in the
Company’s traditional interests in employee
welfare, joint consultation, scientific management
and, of course, accounting. Furthermore,
the personal beliefs of the Renolds were critical
in the coupling of these strands and in the
mobilisation of them as a strategic response to
perceived problems and priorities. The injection
of accounting into the industrial relations do- 
main was stimulated by an intent to confront and 
resolve a number of immediate Company con- 
cerns. The scheme’s structure and content 
reflected this, with the improvement of manage- 
ment—labour relations, enhanced managerial 
control over costs and work methods, and in- 
creased Company competitiveness all being 
issues addressed within the scheme. 

Secondly, the strategic mobilisation of accounting
is not necessarily the product of a singular
rationality. Accounting can become simultaneously
coupled to a variety of organisational
priorities and to a variety of organisational struc- 
tures and processes. Moreover, the rationale of 
such couplings is based less upon any inherent 
or essential function for accounting than upon 
its perceived potentiality to offer a pragmatic 
contribution to the resolution of particular local 
problems. From such a perspective, idiosyncra- 
tic perceptions and expectations of the account- 
ing mission, what accounting is and what it can 
achieve, are of critical importance since they
not only constitute the driving force underpinning
the emergence of particular forms of accounting
but also influence the roles anticipated
for it. Clearly such considerations are of paramount
importance for the exploration and appreciation
of the highly differentiated nature of
the accounting craft. 

The strategic coupling of an accounting based
joint consultative dialogue (and anticipated improvements
in management-labour relations)
to the introduction of tighter managerial controls
might not, on the face of it, appear the most
natural association for management to form. It is
apparent, however, both from C. G. Renold’s
writings and indeed from the scheme itself that
this was a relationship which to him seemed
eminently feasible and potentially very useful.
Accounting was to be simultaneously employed
to create the anticipated agenda and script for
management-labour discussions and to enhance
managerial control over workshop activities.
The managerial commitment to this objective
was sufficient for a profit-sharing scheme to be
devised which achieved a satisfactory level of
technical integration for the underlying ac- 
counting system. A series of accounts and ac- 
counting measures were created which allowed 
management to forge linkages between corpo- 
rate performance, the operation of the control 
system and the profit-sharing scheme. 

Thirdly, if accounting systems are to be 
strategically placed by management in a number
of distinct, though overlapping, organisational 
arenas it would appear critical to monitor and as- 
sess their capacity to satisfy the expectations 
which stimulated their emergence. More specifi- 
cally, if accounting is perceived to possess 
various enabling and facilitative properties, en- 
couraging its mobilisation in a variety of local 
contexts for a number of diverse purposes, its
performance needs to be evaluated on the basis
of a correspondingly diverse set of measurement
criteria. It is essential that research is conducted
with sensitivity to the need to disaggregate the functioning
of accounting so that its particular successes
and failures can be more clearly distinguished.
Furthermore, active and ongoing reformulation
of the accounting mission as a result of
its operation and its progressive penetration of 
organisational life requires similar scrutiny. 

Accounting’s continuous involvement with
the dynamic processes of organisational action 
suggests that it is likely that both the technology 
of accounting, its measures and descriptions and 
its impact on the expectations and beliefs of 
those who come into contact with it will be- 
come susceptible to adjustment and modifica- 
tion. 

The Hans Renold case clearly exhibits the importance
of such considerations. The economic
progress of the Company during a period of hostile
environmental restraint was considerable.
Improvements in work methods, cost control, 
factory efficiency and competitive position were 
all noted by the profit-sharing committee. More- 
over, employee representatives on the commit- 
tee coupled these improvements to the labour 
collaboration secured by the incentive of profit- 
sharing. However, managerial aspirations for the 
scheme in the joint consultative arena were to 
prove far more problematic. C. G. Renold’s attempt
to forge a scientific basis for the prerogatives
and bases of managerial decision-making
crumbled under the pressure of intense labour 
scrutinisation of managerial arguments. The use 
of outside contractors, the appropriate level of 
the wages of capital and the unaccountability of 
the “variable factor” were all issues debated by 
the committee on which the accounting
calculus failed to convince the employee
representatives of the scientific rigour of managerial
decisions. Similarly, managerial attempts
to use accounting principles and numbers as an 
educative medium, to explain to labour the re- 
strictions placed upon managerial behaviour 
and discretion by environmental priorities, were 
equally unsuccessful. The managerial promotion 
of the requirements of outside agencies (the
bank and the financial market) left labour unconvinced
that changes to the technical treatment of depreciation
and stock valuation were uncontaminated
by partisan aspirations. Indeed, that management
could change the entire basis of the
scheme when Company priorities were deemed 
to necessitate it seemed to reinforce labour’s 
scepticism of this particular managerial argu- 
ment. So unsuccessful were the managerial aspi- 
rations for accounting in this context, that they 
sought to relegate the centrality of accounting 
numbers for the committee’s discussions and 
were ultimately forced to admit the often fragile 
foundations of their decision-making. 

Fourthly, the injection of accounting into the 
domain of the social is, however, more signifi- 
cant than the opportunity to monitor the per- 
formance of a technical tool being subjected to 
debate and evaluation as regards its calculative 
robustness. The strategic intertwining of ac- 
counting with various organisational structures 
and processes has important implications. When 
residing in particular local contexts accounting
cannot be detached from the underlying powers
and rights which underpin the contextual loca- 
tion of its emergence and operation. Further- 
more, accounting systems cannot be divorced 
from the hard tangible effects of organisational 
actions on those participants involved in its op- 
eration. Indeed, such considerations are likely to 
be highly influential in shaping the progression 
of accounting in particular situations and, of
course, for any significant appraisal of its performance.

The Hans Renold profit-sharing scheme was 
initiated and designed by management. Further- 
more, all critical changes to its rules and proce- 
dures were the result of managerial imposition. 
These actions were made possible by a distribution
of organisational power which dictated such
actions to be both desirable and functional. As a 
consequence the very accounting system itself 
— what was to be measured, how it was to be 
appraised and its implications for managerial 
action — reflected these underlying relation- 
ships. In addition, accounting was to become 
similarly implicated with other organisational
phenomena dependent upon the power rela- 
tionships underpinning its context. Wage reduc- 
tions, redundancies and the lack of bonus pay- 
ments were all to intersect at various stages with 
accounting numbers and principles. Under such 
circumstances the mobilisation of accounting in 
the domain of the social takes on increased sig- 
nificance. 

The strategic managerial coupling of organisa- 
tional control, joint consultation and accounting 
numbers was constructed on the basis of a very 
clear philosophy as to the social and industrial relationships
which were to guide its progress. However,
the very volatility and conflict which had
contributed to the managerial desire to launch 
such a programme were not susceptible to such
convenient resolution. More significantly, ac- 
counting numbers and principles having been 
injected in the very centre of such considera- 
tions became the focal point and vehicle for the 
articulation of wide ranging disagreements and 
labour challenges. Whilst on occasions these 
focused on the technical inability of accounting 
to provide unambiguous measures of economic 
performance, they were progressively to de- 
velop and ultimately to incorporate the very 
foundations of organisational relationships. 
Labour’s continual claims that the wages of capi- 
tal were too high, that shareholders were over 
protected by the scheme, that labour had a prop- 
rietary interest in corporate profits and even that 
all organisational participants, be they directors
or juniors, should be equally rewarded by the
scheme reflected their concern with the very 
basis of organisational relationships and the 
rights and privileges of different classes of organisational
participants. Labour’s utilisation of
their local perceptions of factory life and ac- 
tivities provided them with a source of knowl- 
edge which constituted a direct challenge to the 
descriptions provided by accounting. Indeed, 
labour’s critical experience of, and personal in- 
volvement with, factory events and changes 
seemed to provide them with a source of knowledge
which they elevated above that generated
by the formal accounting system. The comment 
that accounting functions in complex and dy- 
namic social situations can possess a glibness 
which almost undermines its real significance. In 
the Hans Renold case, no appreciation of the 
emergence, roles and consequences of account- 
ing would be possible without a thorough 
scrutinisation of the social, economic and politi- 
cal relationships which underpinned its contex- 
tual location. 


Abstract


This paper examines the impact of accounting information on the sequential judgments of experienced 
bank loan officers using realistic lending cases in an experimental setting. The findings suggest that loan 
officers reach a high level of confidence early in the lending process based on summarized accounting 
information and other general background data. When, later in the process, factors concerning the firm's 
financial plans and their underlying assumptions are varied, lenders adjust their confidence in whether or 
not to grant the loan in the expected directions, even when the subsequent evidence disconfirms their
original positions.


Accounting information is used in many busi- 
ness decisions, including the very important 
area of bank lending. Even though financial 
accounting is said to be developed to assist ex- 
ternal users in their business decisions, with the 
two primary external user groups identified as 
investors and creditors (FASB, 1978), there is 
very little empirical work that examines lending 
decisions and how creditors process accounting 
data.

We examine the impact of accounting information
on the sequential judgments of bank loan
officers using realistic condensed lending cases
in an experimental setting. The study reflects a
high degree of external validity, and provides 
new insights into how and when various types of 
accounting information are used in banks’ 
credit-granting decision process. 

The results of our research suggest that 
lenders reach a high level of confidence in their 
credit-granting decisions very early in the infor- 
mation gathering and evaluation process. This 
initial confidence is based on summarized 
financial accounting data and other background 
information. At the same time, lenders’ confidence
in their judgments at subsequent stages of
the information gathering and evaluation process
is materially altered by more detailed
forms of accounting data. Even in cases where 
subsequent information disconfirms prior judg- 
ments, lenders respond by revising their confi- 
dence in the credit-granting decision consistent 
with the nature of the subsequent evidence. 
While this result is consistent with the common
finding that strong initial confidence is obtained 
early in the decision process, it is inconsistent 
with the many studies that find disconfirming 
evidence not to be searched for and/or to be ig- 
nored once a strong initial impression has been 
formed. 

The study provides evidence that accounting 
data have a material impact on lending decisions, 
and that summarized basic financial statement 
data, collected independently of the client bor- 
rower early in the process, have the greatest im- 
pact on their decisions for the class of borrower 
examined in this study. These results point to a 
previously unexplored source of data for future
investigation into accounting’s role in the credit- 
granting process. 

The next section of the paper discusses the 
nature of the credit-granting process, and the 
subset of credit-granting decisions examined in 
this study. The third section of the paper
describes the research design and hypotheses,
with the results and analysis reported in the
fourth section. The final section summarizes the
results and implications for future research.


*The authors appreciate the financial support of the Touche Ross Foundation, which made this study possible. We appreciate
the assistance of Carol Frost, Jaysri Sankaran and Jacob Thomas.


THE LENDING ENVIRONMENT 


Synopsis of the lending process

The first step in our project was to learn as 
much as possible about when and how accounting
information is actually used in the credit-granting
process. Based on structured but open-ended
interviews with chief lending officers at
three major banks in Detroit, two major banks in 
Chicago and three large San Francisco banks, we 
found the following generalizations to hold. 

1. For small borrowers (e.g. sales of less than 
$10—$15 million) the quality of the accounting 
data is very inconsistent, sometimes of limited 
use to lenders. Financial plans are generally un- 
available. Bank loans are an important source of 
capital for these borrowers. 

2. For medium size borrowers (e.g. sales be- 
tween $20 and $40 million) the accounting in- 
formation is generally of acceptable quality and 
very important to the credit granting process. 
Financial plans are frequently available and of 
use to lenders. Bank loans are an important 
source of capital for these borrowers. 

3. For large borrowers, the accounting infor- 
mation is normally of a high quality, usually au- 
dited, and often more detailed than what is re- 
ported to investors in annual reports. Bankers’ 
ability to obtain all accounting data may be limited
by the borrower. However, detailed non-published
data are often not necessary for lenders
to evaluate credit applications. These bank
loans are frequently a small part of the capital re- 
quired by large borrowers, who use public debt, 
leases, etc., as their major sources of capital. 

Based on this analysis, we determined that the 
medium sized borrowers would be of greatest 
interest for this study. The inconsistency in smal- 
ler borrowers caused research design and valid- 
ity problems we chose not to deal with, while 
large borrowers might be better evaluated using
publicly available data (e.g. analysis of bond rat- 
ings) rather than the experimental approach 
used here. 

Focusing our attention on medium sized bor- 
rowers, we interviewed lenders to identify those 
aspects of the sequence of data gathering and 
analysis that were common to the evaluation 
process across lending institutions. We focused 
on the lending decision for a new client. This 
situation provided the most unambiguous set- 
ting for evaluating the complete sequence of 
steps in the lending process. Decisions regarding 
existing borrowers were based heavily on track 
record and interpersonal relationships, making 
it difficult or impossible to isolate the effect of 
accounting information and creating serious in- 
ternal validity problems. 

We found the sequence of the lending decision
process for new clients to be relatively easy
to standardize, enabling generic lending cases to
be developed around the following three-phase
process.

Phase 1. Examine publicly available data on 
potential borrower to make a preliminary judg- 
ment on the quality of the proposed loan. 

Phase 2. Make personal contact with prospec- 
tive borrower, normally at borrower’s place of 
business, to size up the borrower’s operations
and future financial and operating plans. 

Phase 3. Perform detailed credit analysis and 
evaluation of historical and forward-looking 
financial data to determine the likelihood of a 
successful loan. 

These three phases of the credit-granting deci- 
sion provide a framework for evaluating lenders’ 
judgments as a sequential process. Detailed 
financial accounting data are the most dominant 
part of the information set in phase 3, where cre- 
dit analysis is performed. However, rather than 
confining the experiment to manipulations of 
phase 3 accounting data, our experiment investi- 
gates the entire sequence to determine the rela- 
tive importance of information at all three 
phases.¹ We note that while accounting data are
the dominant ingredient in phase 3, important
accounting data are introduced at all three
phases of information gathering and evaluation.

¹One drawback with generic cases focusing on phase three manipulations alone was the inconsistent role of the lending
officer in the credit analysis. Lending officers at some banks do their own credit analysis while other banks have separate
credit analysts who work with or for lending officers.


Details of lending phases 

Phase one of the credit evaluation process in- 
volves gathering background information on a 
prospective borrower, including a brief history 
of the company and a credit report. This back- 
ground data includes highly summarized 
financial report information, such as a credit rat- 
ing, details about location(s), number of 
employees, when the company began, names of 
key officers, and so on. When current, the credit 
reports from financial services such as Dun & 
Bradstreet (D & B) are considered by lenders to 
be the most useful source of information in 
phase 1. These credit reports contain data from 
financial statements voluntarily submitted to the 
credit service by the most closely held com- 
panies. The D & B report also indicates whether 
the statements are audited, reviewed or com- 
piled, the name of the CPA firm (if any), any un- 
usual features of the data, and other pertinent in- 
formation. The combination of the credit re- 
ports and the lender’s knowledge of the client’s 
officers, major owners, lines of business and industry
position (facilitated by such sources as
Robert Morris Surveys) provides the background
data included in phase 1.

Phase 2 of the process involves a personal “siz- 
ing up” of the borrower’s business prospects and 
managerial skills. This phase normally includes 
an on location visit by the lender. Any back- 
ground data missing from what is normally col- 
lected in phase 1 is usually collected during 
phase 2. Also, phase 2 normally contains a dis- 
cussion with the client as to the purpose of the 
loan, their future operating plans and their 
financial plans for the repayment of the loan. 

Detailed historical and forward-looking 
accounting data are normally either collected by 
the lender during their visit with the borrower, 
or else forwarded to the bank soon after the visit.
These accounting data help lenders to assess the 
need for the loan, its purpose, and the quality of 
the borrower’s organization. In conjunction
with the interpersonal aspects of the visit, these
financial accounting data provide important sig- 
nals to lenders regarding the borrower's produc- 
tion capabilities, financial planning and control 
procedures, and the overall integrity of the man- 
agement team. The availability of detailed 
operating and financial data during the visit is an 
important signal to lenders in assessing manage- 
ment’s financial preparedness. 

In phase 3 the lender performs detailed 
evaluations of the historical and forward-looking 
accounting data. This process is designed to de- 
termine whether the forward-looking data pro- 
vide for the successful repayment of the loan 
from future operations, and whether the under- 
lying assumptions about the future along with 
historical facts from past performance are suffi- 
ciently supportive of such a plan. The phase 3 
analysis, along with the other data collected in
earlier phases, culminates in a decision to grant
or not grant the proposed loan. The decision is 
made by an individual or a group, depending on 
the lending officer’s authorized lending limit and 
other bank policies.²
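
The role of the authorized lending limit in deciding who makes the final decision can be sketched in a few lines. This is only an illustration of the rule described in the accompanying footnote (an officer may approve loans up to his or her limit, and larger loans go to a committee). The function name `requires_committee` and the sample amounts other than the $200,000 and $1,000,000 limits are hypothetical and do not come from the study.

```python
def requires_committee(loan_amount: float, officer_limit: float) -> bool:
    """Return True when the loan exceeds the officer's authorized lending limit.

    Assumption: a loan exactly equal to the limit can still be approved by
    the officer alone ("up to" the limit), as the footnote's wording suggests.
    """
    return loan_amount > officer_limit


# Hypothetical loan requests checked against the limits used in the footnote.
for amount, limit in [(150_000, 200_000), (300_000, 200_000), (300_000, 1_000_000)]:
    route = "committee" if requires_committee(amount, limit) else "individual officer"
    print(f"Loan of ${amount:,} with a ${limit:,} limit: decided by {route}")
```

The same request can therefore be an individual decision for one officer and a group decision for another within the same bank.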


Factors affecting lending 

We expected loan officers to develop high 
levels of confidence in lending judgments early
in the process, but also to modify their confi- 
dence in response to signals they receive during 
the process. 

Judgment research has shown that in settings
lacking clear and quick outcome feedback,
decision-makers display extreme and inappropriate
confidence in the quality of their judgments
(Einhorn & Hogarth, 1978). Oskamp
(1965) demonstrated that such over-confidence
occurs even among experienced professionals
making routine professional judgments. In addition,
once settled on a hypothesis, judges typically
do not consider evidence that is disconfirming
or consistent with alternative hypotheses
(Elstein et al., 1978; Lord et al., 1979; Koriat
et al., 1980).

²For example, a given lending officer might have a limit of $200,000, meaning he or she can approve loans up to $200,000
without committee approval, but is required to make a presentation to a committee for loans in excess of $200,000. Within
the same bank there may be some officers with, say, $1,000,000 limits, while others may have $250,000 limits, and so on.

However, we believe that bank loan officers 
work in a decision-making environment that fos- 
ters a different pattern of confidence modifica- 
tion. First, they must anticipate defending their 
loan decisions before a loan committee. Such a 
setting has been shown to enhance people’s
memory of cues and to lead them to incorporate
more cues in their judgments (Ebbesen & Konecni, 1975).
Second, loan officers encounter a wide variety of loan
situations both personally and through reports 
of colleagues (Holt, 1984). This may allow them 
to imagine alternative hypotheses and, as 
Einhorn & Hogarth (1982) posit, reduce the 
credibility of a favored hypothesis. Moreover, 
there is evidence that expert decision-makers 
who must justify their judgments are able to de- 
tect and respond to subtle variances in routinely 
analyzed data (Danos et al., 1984). 


EXPERIMENTS AND HYPOTHESES 


The decision variable

The three phases of the information-gathering 
and evaluation process provide a framework 
within which lenders’ sequential judgements are 
examined. The decision variable of interest eli- 
cited from lenders at each stage in this informa- 
tion gathering and evaluation process is the con- 
fidence in their judgment that the loan would be 
granted or not granted. Our search of the literature
revealed that very little work had been done
on the lending decision, and that some widely
referenced prior work focused on the interest
rate as the decision variable (Libby, 1979).
However, our interviews
with bankers revealed that interest rate is not of 
great importance during the decision process. It 
seems that interest rate is more an institutional/market
condition variable than one considered
by individual loan officers in assessing potential
borrowers. The rate of return for a loan is
affected by interest rate and other factors such as 
the mix of other services a client might commit 
to in conjunction with a loan. 


PAUL DANOS et al. 


Alternatively, it is very realistic to assume that 
lenders have a feel for how likely it is that a pro- 
posed loan will be granted at any stage of the in- 
formation-gathering and evaluation process. 
That the process has three fairly distinct and uni- 
form phases to it enhances their ability to pro- 
vide interim judgments at these three points in 
the process. Given that our interest in this exper- 
iment is to learn more about the relative impor- 
tance of the information found in the three diffe- 
rent phases, we evaluate changes in confidence 
in the grant/not grant decision after each of the 
three stages. 

In order to compare bankers’ decisions across
lending institutions and lenders, we examine
lenders’ confidence in their decision to grant or
not grant a proposed loan. By recording three
separate observations for each case, we allow
each subject’s change in confidence to be the
measurement of interest. Also, at the conclusion
of each case we elicit the lender’s subjective 
probability that the loan described in the case 
will be “fully serviced”. 


Experimental instruments 

Many different variables included in the three 
phases of data collection and analysis were can- 
didates for manipulation. From the set of realis- 
tic possibilities, we chose to evaluate several 
dimensions considered to be important in the 
lending process. Our discussions with bankers 
suggested that their decision processes were in- 
fluenced by risk class. As a result, we developed 
two corporate cases, one a financially strong and 
one a financially weak case, for use in the experi- 
ments. Once the basic financial risk level was es- 
tablished, a large number of potential manipu- 
lated variables became “fixed” in order to main- 
tain the overall realism of each case. For 
example, historical financial accounting data are 
a key element in risk assessment, and the man- 
ipulation of these historical data would not be 
possible if they changed the inherent risk manifested
in the data.

A second manipulated variable was the initial 
availability of the prospective financial data dur- 
ing the phase 2 visit to the borrowers. While all 
borrowers could be expected to have historical 
data on hand, only those who had prepared forward-looking
data consistent with a feasible
term loan would be ready to provide such data 
during the visit. The prospective accounting 
data provided a signal of management's planning 
competence and the degree of attention they 
had devoted to how the proposed loan would be 
employed and repaid. 

The final manipulated variable also came from 
the forward-looking accounting data. Even in 
cases where forward-looking data are not ini- 
tially available at the first visit, we assumed that 
such data would be presented to the bank before 
the lending process advanced to phase 3. The 
nature of the assumptions underlying the forward-looking
operating results could be
realistically manipulated without conflicting
with the underlying riskiness implied by the historical
data. The assumptions motivating the
prospective accounting data were of great interest
to lenders in their assessment of the successful
payback of the loan.

The variations examined in phase 1 (P1),
phase 2 (P2), and phase 3 (P3) of the lending
process are outlined in Fig. 1. The accumulation
of data implicit in this sequencing is consistent
with the decision process described by lending
officers except that, for experimental purposes,
we employ discrete elicitation points. The actual
decision may be continuous.

Phase 1                          Phase 2                          Phase 3

Financially strong case          Management financial plan is     Assumptions of plan are well-grounded
with background information      not available during visit       Assumptions of plan are not well-grounded
including current D & B report
                                 Management financial plan        Assumptions of plan are well-grounded
                                 is available during visit        Assumptions of plan are not well-grounded

Financially weak case            Management financial plan is     Assumptions of plan are well-grounded
with background information      not available during visit       Assumptions of plan are not well-grounded
including current D & B report
                                 Management financial plan        Assumptions of plan are well-grounded
                                 is available during visit        Assumptions of plan are not well-grounded

Fig. 1. Variations in information available to lenders in a typical lending setting.

The experiment was designed primarily to
examine how confidence is formed and modified
in a lending situation. This was accomplished
by measuring how the lenders’ confidence
changed over time and how variations in
accounting data affected their confidence.

We manipulated the risk variable as a within-subject
treatment so that each subject received
both the financially strong and the financially
weak cases. The risk level was established by 
using summarized historical financial data and 
their corresponding actual risk “rating” as pre- 
pared by Dun & Bradstreet for two actual com- 
panies. The accounting data used as treatments 
in the second and third stages of the evaluation 
process for each case were introduced on a 
between-subject basis. Each subject received 
the same level of the phase 2 and phase 3 treat- 
ments for both cases. Each subject provided 
three responses concerning their confidence in 
the grant/not grant decision for each of the two 
cases, one response after each phase. The responses
called for the lender to predict, on the
basis of the data received to that point (after
phase 1, after phase 2 and after phase 3), whether
or not the loan would be granted, and then to indicate
how confident they were that their prediction
would be the correct one in 100 identical
cases. The repeated measurements taken
after introducing phase 2 and phase 3 provided
an indication of the change in their confidence
attributable to the new information in phase 2
and phase 3.³

At the conclusion of each case, we asked each
subject to evaluate the probability that, if
granted, the loan described in the case would be
fully serviced. This follow up question provided 
a check of our primary elicitation concerning 
their confidence in the grant/not grant decision, 
since confidence should be associated with their 
belief that the loan would be fully serviced. 


Subjects and administration

The experiments were administered to 52 
subjects at five major Midwest and West Coast
banks. The lenders participated in groups of
2 to 15 subjects in their respective bank’s
facilities, and were randomly assigned to treatment
groups. The time to complete the experiments
ranged from 65 to 95 minutes, with an
average time of 75 minutes. Information pro- 
vided to the subjects was closely controlled by 
the researchers, and none of the experimental 
materials were revealed to the participating 
banks until administration of the experiments. 

We administered the experiments in person,
providing a brief and uniform introduction and
then remaining at hand to answer any questions.⁴
Neither the order of presentation nor the cases 
themselves appeared to affect the time spent, 
with most subjects spending approximately 
equal time on the two cases. Very few questions 
were asked once the practice set materials were 
completed. 
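
The structure of the design described above can be laid out as a short sketch. It assumes only what the text states: two between-subject accounting treatments (availability of the plan, A, and quality of its assumptions, Q), two within-subject cases, three elicitation points per case, and 52 subjects. The variable names are hypothetical, and nothing here reproduces the experimental materials themselves.

```python
from itertools import product

# Between-subject treatments (labels are illustrative shorthand).
AVAILABILITY = ("plan available during visit", "plan not available during visit")  # A
QUALITY = ("well-grounded assumptions", "not well-grounded assumptions")           # Q

# Within-subject factors: every subject sees both cases and answers after each phase.
CASES = ("financially strong", "financially weak")
PHASES = (1, 2, 3)

N_SUBJECTS = 52  # number of lenders reported in the text

# The four between-subject cells formed by crossing A and Q.
groups = list(product(AVAILABILITY, QUALITY))

# Each subject gives one confidence response per case per phase.
responses_per_subject = len(CASES) * len(PHASES)
total_responses = N_SUBJECTS * responses_per_subject

# Between-subject error degrees of freedom: subjects minus number of cells.
error_df = N_SUBJECTS - len(groups)

for number, (availability, quality) in enumerate(groups, start=1):
    print(f"Cell {number}: {availability}; {quality}")
print("Confidence responses per subject:", responses_per_subject)
print("Confidence responses in total:", total_responses)
print("Between-subject error degrees of freedom:", error_df)
```

The error degrees of freedom computed this way agree with the 48 reported in the analysis of variance tables.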


Testable hypotheses 

The experiment was designed to permit us to 
evaluate the impact of the three manipulated 
variables on lending judgments. On the basis of 
our fieldwork, we believed that the risk level 
would be a significant factor, but we were uncer- 
tain as to how it might interact with the other 
data provided. 

In addition, we were interested in the pattern 
of responses for each case. How confident would 
lenders be after the first phase? As they received 
additional information in the second and third 
stage, how would their confidence in their judg- 
ment change? Would judgments change from 
grant to not grant or vice versa? Would the de- 
tailed financial reports analyzed in phase 3 sig- 
nificantly increase lenders’ confidence? Is a 
higher level of confidence attained earlier in the 
process with nonrisky borrowers versus risky 
borrowers? By observing the patterns of re- 
sponses we expected to gain insight into these 
previously unexplored issues. 

We expected high confidence levels beginning
at phase 1, consistent with previous
studies. However, the fieldwork also helped us
develop additional expectations at variance with
the previous findings that subjects fail to search
for or to use disconfirming evidence. Lenders
told us that the availability of forward-looking
accounting data signaled management planning
competence, as did the quality of the assumptions
upon which the data were based. We expected,
therefore, that the confidence of our lender
subjects would be modified consistent with
the direction of these variables. In other words,
the availability of well grounded forward-looking
accounting data (a positive signal) would increase
their confidence in a “grant” decision and
decrease their confidence in a “not grant” decision.

³Samples of the experimental materials are available from the authors upon request.

⁴The researchers presented a brief verbal introduction and worked through a “practice set” with the subjects to familiarize
them with the experimental instruments and judgment elicitation, answering any questions concerning the task during the
practice session. This introduction took between 8 and 13 minutes to complete, depending on the presenter and the
questions. Once the subjects were comfortable with the experimental instruments, they were free to work through the two
(randomly ordered) cases and the “concluding questions” at their own pace.


RESULTS 


Our results are reported in two parts. The first 
deals with the bankers’ sequence of responses 
concerning their confidence in the grant/not 
grant decision elicited at three points in each
case. The second part deals with the bankers’ 
evaluation of the likelihood that a loan like the 
one described in the case would be fully ser- 
viced if it were granted. 


Results of confidence experiments 

The subjects were nearly unanimous about 
the basic grant/not grant decisions. For the financially
weak case, 48 of 52 bankers judged that
the loan would not be granted after the initial 
data revealed in phase 1, with one additional 
banker changing from “grant” to “not grant” after 
phase 3. For the financially strong case 47 of 52 
subjects initially judged that the loan would be 
granted, with three more switching to “grant” 
during the experiment. 

As expected, lenders’ overall level of confi- 
dence in their judgments increased as additional 
information was provided. Table 1 shows the 
overall mean level of confidence after each of 
the three phases of information was revealed in
the two cases (without regard to other manipulated
variables).

TABLE 1. Mean confidence in judgements
(all subjects, all treatments)

                                     Financially          Financially
                                     strong case          weak case
                                     (Grant the loan)     (Not grant the loan)

After phase 1, where                 74%                  72%
background data is obtained

After phase 2, where                 76%                  77%
the visit is made to
the client’s offices and
forecast is either available
or not available

After phase 3, where                 82%                  83%
the financial statements
and forecasts are obtained.
The forecasts have either
high or low quality assumptions

For the financially weak case, the initial level
of confidence in their judgments after phase 1
was quite high (72), suggesting that a great deal
of information is obtained even before the len- 
der meets with a prospective borrower. The 
meeting and information revealed in phase 2 are
associated with an increase in confidence for the
financially weak case, from 72 to 77 or +5, while
the detailed analysis of historical and forward- 
looking financial data in phase 3 resulted in a 
slightly greater increase in confidence (from 77 
to 83 or +6). For the financially strong case, the 
subjects’ initial average confidence in their judgments
was somewhat higher than for the financially
weak case (74 vs 72), providing evidence
that lenders have a great deal of confidence in 
their expected lending judgments prior to ever 
meeting with a new client. The signals from the 
meeting in phase 2 had only a moderate positive 
impact on average confidence (from 74 to 76 or 
+2), while the detailed historical and forecasted 
financial data once again had a great effect on 
average confidence (from 76 to 82 or +6). On 
average, lenders predicted after phase 3 that in 
82 of 100 cases like the financially strong case 
the loan would be granted. 
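
The phase-to-phase changes quoted in this paragraph follow directly from the mean confidence levels in Table 1. The short sketch below recomputes them from those six percentages. The dictionary layout and the helper name `phase_changes` are hypothetical conveniences, not part of the study.

```python
# Mean confidence (in percent) after phases 1, 2 and 3, as reported in Table 1.
mean_confidence = {
    "financially strong": [74, 76, 82],
    "financially weak": [72, 77, 83],
}


def phase_changes(levels):
    """Return the change in mean confidence between consecutive phases."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


changes = {case: phase_changes(levels) for case, levels in mean_confidence.items()}

for case, (phase_1_to_2, phase_2_to_3) in changes.items():
    print(f"{case}: phase 1 to 2 = {phase_1_to_2:+d}, phase 2 to 3 = {phase_2_to_3:+d}")
```

Read this way, the phase 2 visit moves average confidence more in the financially weak case (+5) than in the financially strong case (+2), while the phase 3 analysis adds the same +6 in both.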

We note that the data in phase 1 contained 
highly summarized financial data while phase 3 
revealed detailed financial statement data. Based 
on the high initial confidence levels after phase 1 


242 


and the significant increase in confidence after 
phase 3, we suggest that financial statement data, 
both historical and prospective, play a pivotal 
role in the lending decision process. The results 
are consistent with the premise that both highly 
summarized financial statement data from inde- 
pendent sources and detailed historical and 
prospective financial statement data and 
financial ratios normally examined in phase 3 
contain information that is useful to lenders. 

Of particular interest was the relatively high 
level of initial confidence in the lending judg- 
ment after phase 1 alone. This outcome provides 
new evidence concerning the importance of 
basic financial statement data and other facts 
prepared by independent services and purch- 
ased by lenders as a primary source of initial 
information about a prospective client. The importance
of the information and analysis provided
by these services should not be understated.
The analysis of financial statements of
closely held companies provided by services
such as Dun & Bradstreet may influence lending
decisions as much as bankers’ own direct analysis
of detailed financial statements. It seems likely
that other research might find Dun & Bradstreet 
analysis useful in evaluating economic aspects of 
smaller closely held businesses. 

Within the overall results reported in Table 1, 
we examined the effect of two manipulated 
accounting variables that provided important 
information to lenders: initial availability of forward-looking
accounting data (A) and the quality
of the assumptions used to prepare the forward-looking
data (Q). Tables 2 and 3 report the
ANOVA results of the experiment for the two
cases separately.⁵ The ANOVA results for between-subject
manipulations of A and Q are reported
separately from the within-subject effects
concerning the time sequence of confidence
measures (designated T for timing effects)
and their interactions with A and Q. The
results are quite different for the two cases. The
manipulated variables (A and Q) tend to have 
less of a systematic effect on the sequential con- 
fidence judgments in the financially weak case 
(Table 2) when compared to the financially 
strong case (Table 3). While the manipulation 
for the quality of the forecast assumptions (Q) 
introduced in the final set of information (phase 
3) was important in both cases (p = 0.07 in 
Table 2 and p = 0.00 in Table 3), the initial availability
of the forecasts (A) introduced in phase 2
was only significant in the financially strong case 
(Table 3, p = 0.00). The mean changes in confi- 
dence reported at the top of Tables 2 and 3 
reveal similar patterns. With the exception of 
one change in Table 3, lenders responded to the 
additional information with an increase in confi- 
dence. The exception occurred in the financially 
strong case when lenders discovered low quality 
forecast assumptions in phase 3, the final infor- 
mation set, after finding forecasts available dur- 
ing their visit. In response to this combination of 
treatments, lenders reduced their average confidence
that the loan would be granted.⁶
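
As a reading aid for the analysis of variance results, the sketch below recomputes each F statistic as the ratio of the effect mean square to the corresponding error mean square, using the values printed in Tables 2 and 3. Every effect has one degree of freedom, so its mean square equals its sum of squares. The names `REPORTED_ROWS`, `f_ratio` and `matches_reported` are hypothetical, and the check covers only rounding consistency. It does not recompute the tail probabilities.

```python
# (table, effect, effect mean square, error mean square, reported F),
# copied from the analysis of variance panels of Tables 2 and 3.
REPORTED_ROWS = [
    ("Table 2", "A", 67.85, 100.37, 0.68),
    ("Table 2", "Q", 347.12, 100.37, 3.46),
    ("Table 2", "A x Q", 124.96, 100.37, 1.24),
    ("Table 2", "T", 7.54, 107.70, 0.07),
    ("Table 2", "T x A", 0.15, 107.70, 0.00),
    ("Table 2", "T x Q", 162.50, 107.70, 1.51),
    ("Table 2", "T x A x Q", 0.35, 107.70, 0.00),
    ("Table 3", "A", 354.46, 40.02, 8.86),
    ("Table 3", "Q", 732.46, 40.02, 18.30),
    ("Table 3", "A x Q", 473.88, 40.02, 11.84),
    ("Table 3", "T", 448.62, 72.76, 6.17),
    ("Table 3", "T x A", 996.96, 72.76, 13.70),
    ("Table 3", "T x Q", 440.35, 72.76, 6.05),
    ("Table 3", "T x A x Q", 167.54, 72.76, 2.30),
]


def f_ratio(ms_effect, ms_error):
    """F statistic for a one-degree-of-freedom effect: effect MS over error MS."""
    return ms_effect / ms_error


def matches_reported(ms_effect, ms_error, reported_f, tolerance=0.006):
    """True when the recomputed F agrees with the printed value up to rounding."""
    return abs(f_ratio(ms_effect, ms_error) - reported_f) < tolerance


for table, effect, ms_effect, ms_error, reported_f in REPORTED_ROWS:
    recomputed = f_ratio(ms_effect, ms_error)
    print(f"{table}  {effect:<10} F = {recomputed:6.2f}  (reported {reported_f:.2f})")
```

The between-subject effects are tested against the between-subject error term and the timing effects against the within-subject error term, which is why each table carries two error mean squares.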

The pattern of changes in confidence for the 
financially strong case generally reveals “good 
news” levels of manipulated variables having a 
greater confidence-increasing effect on lenders’
GRANT predictions than “bad news” manipula- 
tions. The opposite is true for the financially 
weak case, where “bad news” manipulations in- 
crease lenders’ confidence in their NOT GRANT 
predictions more than “good news” manipula- 
tions. 
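
The asymmetry described here can be illustrated with the phase 2 to phase 3 mean changes reported in Tables 2 and 3, averaged over the availability treatment for each level of assumption quality. This is a rough sketch: the unweighted average of two cell means equals the marginal mean only if the treatment groups are the same size, which the text does not state explicitly, and the names `phase3_change` and `mean_by_quality` are hypothetical.

```python
# Mean change in confidence from phase 2 to phase 3 (percentage points),
# keyed by (forecast availability, quality of assumptions).
# A negative value denotes the decline shown in parentheses in Table 3.
phase3_change = {
    "financially weak": {
        ("initially available", "high"): 0.77,
        ("initially available", "low"): 9.00,
        ("not initially available", "high"): 4.54,
        ("not initially available", "low"): 8.62,
    },
    "financially strong": {
        ("initially available", "high"): 9.23,
        ("initially available", "low"): -7.00,
        ("not initially available", "high"): 12.31,
        ("not initially available", "low"): 9.69,
    },
}


def mean_by_quality(cells, quality):
    """Unweighted average of the cell means sharing one quality level."""
    values = [change for (_, level), change in cells.items() if level == quality]
    return sum(values) / len(values)


for case, cells in phase3_change.items():
    high = mean_by_quality(cells, "high")
    low = mean_by_quality(cells, "low")
    print(f"{case}: high quality {high:+.2f}, low quality {low:+.2f}")
```

Under that equal-weight assumption, high quality assumptions ("good news") raise confidence in the GRANT prediction more in the financially strong case, while low quality assumptions ("bad news") raise confidence in the NOT GRANT prediction more in the financially weak case.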

The overall patterns revealed in Table 1 are 
augmented by the results reported in Tables 2 
and 3, which show the effects of confirming and 
disconfirming levels of information within the 
phase 2 and phase 3 data sets. We see that initial 
availability of forward-looking data from phase 2 
has less impact on lender confidence than does 
the quality of the assumptions provided in phase 


*This simplifies the analysis. ANOVA results with the two cases as a within-subject variable do not alter the conclusions or 


the strength of the results. 


‘Recall that the information provided in each of the last two phases includes more than the manipulated variable. Only if the 
manipulated variable is relatively important and of a disconfirming nature would we expect a decline in lenders’ average 


confidence concerning the grant/not grant decision. 
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TABLE 2. Financially weak case 
(A) (Q) 
Quality Mean change in confidence 
Forecast of Subject Phase 1— Phase 2~— 
availability assumptions . group phase 2 phase 3 
High 1 2.69% 0.77% 
Initially 
available 
Low 2 6.15% 9.00% 
High 3 6.54% 4.54% 
Not initially 
available 
Low 4 4.62% 8.62% 
Analysis of variance 
Between-subject sS DF MS F Tail probability 
Availability of forecast 
in phase 2(A) 67.85 1 67.85 0.68 0.42 
Quality of forecast 
assumptions in phase 3(Q) 347.12 1 347.12 3.46 0.07 - 
AXQ 124.96 1 124.96 1.24 0.27 
Error 4817.92 48 100.37 
Within-subjects: 
Timing (7) effect of first 
and second confidence 
measures 7.54 1 7.54 0.07 0.79 
TXA 0.15 1 0.15 0.00 0.97 
TXQ 162.50 1 162.50 1.51 0.23 
TXAXQ 0.35 1 0.35 0.00 0.96 
Error 5169.46 48 107.70 





3. For the financially weak case, even as lenders 
approached high levels, of confidence that the 
loan request would not be granted, their average 
confidence in the not grant decision increased at 
a much slower rate when presented with high 
quality assumptions. Our fieldwork led us to ex- 
pect that both detailed historical data and for- 
ward-looking data would be important elements 
in lending decisions for companies similar to our 
cases. However, the quality of the assumptions 
concerning future accounting performance was 
the important accounting attribute normally 





found in the phase 3 data that offered a realistic 
variable to. manipulate without contradicting 
the historical financial data provided in phase 1.’ 
The importance of forward-looking accounting 
data is well documented in the accounting liter- 
ature (e.g. Brown et al., 1985), with the predic- 
tive role of accounting data established as an ob- 
jective by the FASB’s Conceptual Framework: 
(FASB, 1978). Here, we find empirical evidence 
supporting the usefulness of forward-looking 
accounting data in a major class of credit-grant- 
ing decisions, lending further. credence to their 


The detailed historical data in phase 3 needed to be consistent with the earlier data (from phase 1) to maintain realism since 
Dun & Bradstreet reports are rarely at variance with the detailed financial statements. Likewise, the forward-looking data 
were effectively required for the class of borrowers examined here, with the availability of these data (in phase 2) and the 
assessed quality of the underlying assumptions (in phase 3) representing important and realistic attributes that were capable 


of manipulation without confounding. 


244 


PAUL DANOS et al. 



































                                                     Mean change in confidence
Forecast                  Quality of     Subject     Phase 1 -     Phase 2 -
availability              assumptions    group       phase 2       phase 3
Initially available       High           1           4.62%         9.23%
Initially available       Low            2           1.69%         (7.00%)
Not initially available   High           3           0.38%         12.31%
Not initially available   Low            4           0.92%         9.69%

Analysis of variance
Source                                            SS        DF    MS       F       Tail probability
Between-subject:
  Availability of forecast in phase 2 (A)         354.46    1     354.46   8.86    0.00
  Quality of forecast assumptions in phase 3 (Q)  732.46    1     732.46   18.30   0.00
  A × Q                                           473.88    1     473.88   11.84   0.00
  Error                                           1911.15   48    40.02
Within-subjects:
  Timing (T) effect of first and second
    confidence measures                           448.62    1     448.62   6.17    0.02
  T × A                                           996.96    1     996.96   13.70   0.00
  T × Q                                           440.35    1     440.35   6.05    0.02
  T × A × Q                                       167.54    1     167.54   2.30    0.14
  Error                                           3492.54   48    72.76
information content.


Likelihood of fully serviced loan

The second set of results supported the efficacy of our primary tests.
These results, concerning the likelihood that the loan in each case will
be fully serviced, are analyzed in a 2 × 4 ANOVA reported in Table 4.
The two variables, risk (at two levels) and treatment concerning
forward-looking data (at four levels equal to the four combinations of
treatments A and Q in Fig. 1), were both significant. The loan for the
financially weak case was perceived as much less likely to be fully
serviced than the loan for the financially strong case, which speaks to
the perceived quality of the two loans. Although the risk level was
certainly dominant in terms of explanatory power, the four combinations
of treatments regarding forward-looking accounting data were also
significant at the 0.05 level. Note that the accounting treatments are
still a between-subject manipulation here. This result provides further
evidence of the importance of prospective accounting information in
lenders' judgments of the overall quality of the loan.
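
As an illustration of how the F statistics in an ANOVA table of this kind are formed, the short sketch below recomputes the within-subjects F ratios from the sums of squares and degrees of freedom printed in Table 4. It assumes only the textbook definition (mean square of the effect divided by mean square of the error term); the function and variable names are ours and are not part of the original study.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return (MS_effect, MS_error, F) for one ANOVA effect."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error


# Within-subjects rows of Table 4: (sum of squares, degrees of freedom).
within_effects = {
    "risk": (68188, 1),
    "risk x forecast effects": (680, 3),
}
# Error term for the within-subjects effects (risk x subject within groups).
within_error = (16343, 48)

f_values = {
    name: f_ratio(ss, df, *within_error)[2]
    for name, (ss, df) in within_effects.items()
}

for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

Rounded to two decimals, the two ratios agree with the F values of 200.27 and 0.67 shown in the table.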


TABLE 4. Likelihood that loan will be fully serviced

(A)                       (Q)                        Mean likelihoods
Forecast                  Quality of     Subject     Financially     Financially
availability              assumptions    group       strong case     weak case
Initially available       High           1           87%             38%
Initially available       Low            2           78%             24%
Not initially available   High           3           73%             28%
Not initially available   Low            4           82%             23%

Analysis of variance
Source                                            SS        DF    MS       F        Tail probability
Between-subject:
  Forecast effects (A and Q)                      2751      3     917      2.27     0.05
  Subjects within group (error)                   16,201    48    318
Within-subjects:
  Risk (financially strong or weak case)          68,188    1     68,188   200.27   0.00
  Risk × forecast effects                         680       3     227      0.67     0.58
  Risk × subject within groups (error)            16,343    48    340


SUMMARY AND CONCLUSIONS


These experiments, conducted with experienced lending officers, suggest
that the manipulated variables were influential in the context of the
decision tasks examined. The perceived risk level of the borrower had a
significant impact on how subsequent information was apparently used.
Also, we note that average lender confidence after seeing the initial
(premeeting) data for both cases was quite high, and the data provided
subsequently in phase 2 and phase 3 seemed to have an overall confirming
effect on the lenders' confidence in their grant/not grant predictions.
These results suggest that lenders reach a very high level of confidence
early in the process based on highly summarized financial accounting
information and other general background data. Further, while lenders
continue to register significant increases in confidence as additional
data are received, seldom does subsequent information cause them to
alter their initial grant/not grant judgment. Still, they react
significantly in the expected way to incremental data which confirm or
disconfirm their prior beliefs, as evidenced by the impact of the
manipulated accounting variables.

Several conclusions can be drawn from this study. First, it confirms the
overall importance of historical and forward-looking accounting
information in the lending decision process. These accounting data are
typically obtained at two points for new clients: first, from
independent credit-rating services in a summarized yet informative
financial profile; and second, from the clients themselves. Both sources
of historical data appear to be important. Since the historical
accounting data independently processed by credit-rating services appear
to have a significant early impact on the credit-granting decision
process, and because the use of these services is pervasive among
lending institutions, a great deal of insight might be gained by further
investigations of precisely how these services use GAAP financial
statements, and how they differentiate between GAAP and non-GAAP
accounting procedures in performing their credit evaluation service.

This evidence is consistent with the contention that prospective
accounting data are used as signals of management planning competence.
Providing well-grounded, forward-looking data for lender examination
seems to signal creditworthiness. The results also tell us something
about how lenders make decisions. Experienced lenders do modify their
confidence as new information is processed, even where the signals
disconfirm their prior positions.


BIBLIOGRAPHY 


Brown, P., Foster, G. & Noreen, E., Security Analysts Multi-Year Earnings Forecasts and the Capital
Market (Sarasota, FL: American Accounting Association, 1985).

Danos, P., Holt, D. & Imhoff, E., Bondraters' Use of Management Financial Forecasts: An Experiment in
Expert Judgment, The Accounting Review (October 1984) pp. 547-573.

Einhorn, H. & Hogarth, R., A Theory of Diagnostic Inference: Imagination and the Psychophysics of
Evidence, working paper (June 1982).

Einhorn, H. & Hogarth, R., Confidence in Judgment: Persistence of the Illusion of Validity, Psychological
Review (September 1978) pp. 395-416.

Ebbesen, E. & Konecni, V., Decision Making and Information Integration in the Courts: The Setting of Bail,
Journal of Personality and Social Psychology (1975) pp. 805-821.

Elstein, A., Shulman, L. & Sprafka, S., Medical Problem Solving: an Analysis of Clinical Reasoning
(Cambridge, MA: Harvard University Press, 1978).

Financial Accounting Standards Board, Statement of Financial Accounting Concepts No. 1: Objectives of
Financial Reporting by Business Enterprises (Stamford, CT: FASB, 1978).

Holt, D., Evidence Integration in the Formation of Risk Assessments by Auditors and Bank Lending Officers,
dissertation, The University of Michigan (1984).

Koriat, A., Lichtenstein, S. & Fischhoff, B., Reasons for Confidence, Journal of Experimental Psychology:
Human Learning and Memory (March 1980) pp. 107-118.

Libby, R., The Impact of Uncertainty Reporting on the Loan Decision, Supplement to Journal of Accounting
Research (Spring 1979) pp. 35-57.

Lord, C. G., Ross, L. & Lepper, M. R., Biased Assimilation and Attitude Polarization: The Effect of Prior
Theories on Subsequently Considered Evidence, Journal of Personality and Social Psychology
(November 1979) pp. 2098-2109.

Oskamp, S., Overconfidence in Case Study Judgments, The Journal of Consulting Psychology (1965)
pp. 261-265.




Accounting, Organizations and Society, Vol. 14, No. 3, pp. 247-258, 1989.
Printed in Great Britain

0361-3682/89 $3.00+.00
Pergamon Press plc


BENEFIT-COST ANALYSIS AND RESOURCE ALLOCATION DECISIONS*


LAWRENCE A. GORDON
University of Maryland, College Park, U.S.A.


Abstract 


Concern over the U.S. federal government's deteriorating infrastructure and large budget deficits has
recently resulted in much attention being focused on the subject of federal capital expenditures
(investments). At the same time, the Office of Management and Budget (OMB) has recently taken steps to
aggressively pursue a policy of requiring federal government managers to use benefit-cost (B-C) analysis
techniques in preparing requests for certain types of capital expenditures. More specifically, the OMB now
requires the use of benefit-cost analysis techniques for major initiatives concerning the acquisition of
information technology systems. In this paper it is argued, and empirically verified, that the federal
government's use of B-C analysis techniques affects the resource allocation process differently at different
organizational levels (i.e. different decision strategies are appropriate at different organizational levels).


Benefit-cost (B-C) analysis has long been a part of the decision-making
process related to major investments in U.S. federal government water
and transport projects (Maass, 1966; Prest & Turvey, 1965). The concept
has also been incorporated into various federal budgeting systems (e.g.
zero-base budgeting). Nevertheless, wide-scale use of B-C analysis for
evaluating capital expenditures (investments) on a project-by-project
basis has not occurred in the U.S. federal government (Bozeman, 1984;
Sumners, 1986). Concern over the U.S. federal government's deteriorating
infrastructure and large budget deficits has recently resulted in much
attention being focused on the subject of federal capital investments.
At the same time, the OMB (Office of Management and Budget) has recently
taken steps to aggressively pursue a policy of requiring federal
government managers to use B-C analysis in preparing requests for
certain types of capital expenditures. More specifically, beginning with
the budget requests for the 1988 fiscal year (FY), the OMB requires the
use of B-C analysis, based on risk-adjusted discounted cash flow
techniques (hereafter referred to as B-C analysis techniques), for major
initiatives concerning the acquisition of information technology
systems. The techniques advocated include the net present value,
internal rate of return and B-C ratio.¹

Ostensibly, the basic assumption underlying the OMB's directive is that
the use of B-C analysis techniques should facilitate a more efficient
allocation of resources than if such techniques were not used. The
objective of the research reported in this paper is to examine the
relationship between the use of B-C analysis techniques and the
allocation of resources in the executive branch of the U.S. federal
government. In pursuing this objective, a key issue that will be
examined is whether B-C analysis techniques are utilized differently at
different organizational levels.

The remainder of this paper will proceed as follows. A discussion of the
growing importance of federal capital investments is provided in the
first section. This discussion draws upon the related literature in the
field and sets the background for the specific hypotheses developed in
the second section. These hypotheses relate to the relationship between
B-C analysis techniques and resource allocation decisions at the
department (agency) level and at the level of the OMB.

The setting and results of an empirical study are discussed in the third
section of the paper. The results of the study indicate that, at the
department level, the federal government's use of B-C analysis
techniques has a significant effect on the resource allocation process
associated with capital expenditures related to information technology
systems. However, the effect of these techniques at the OMB's level is
not nearly as clear. These findings provide empirical support for the
argument that different decision strategies are utilized at different
organizational levels (Dirsmith & Jablonsky, 1979a,b; Thompson, 1967).
In the fourth, and final, section of the paper, some concluding comments
are offered.


*The author wishes to express his appreciation for comments received on earlier versions of this work by the research
workshop participants at Columbia University, Harvard University, McMaster University and Pennsylvania State University.
The comments on this work by S. Fettus, A. Schick, K. Smith, A. Stark, J. Tsay and two anonymous referees are also appreciated.

¹ In the private sector, these techniques are commonly used and often referred to as sophisticated capital budgeting methods
(Gordon & Pinches, 1984; Haka et al., 1985; Klammer, 1972; Moore & Reichart, 1983; Pike, 1983; Rosenblatt & Jucker, 1979;
Scapens & Sale, 1985).


FEDERAL CAPITAL INVESTMENTS 


The need for improved federal capital investment information has been a
key concern of the U.S. General Accounting Office (GAO) during the last
decade. In 1981, the GAO published a report which strongly criticized
the management of federal capital investments. In that report, it was
noted that the planning and control of federal capital investments are
carried out in a haphazard fashion. In the GAO's (1981, p. 12) words:
"No federal organization is responsible for evaluating or assessing
capital investment as a discrete policy issue or for taking a
cross-cutting look at capital investment to see how it effects national
priorities." In the same report, the GAO (p. 95) went on to recommend
that ". . . a policy-level approach to capital investment must be added
to the Federal Government's decision-making, and that sound, up-to-date
information is needed to support that approach." A subsequent GAO (1983)
report expanded upon the above issue by discussing the pros and cons
related to various ways of presenting capital investment information in
the federal budget. While favoring a unified rather than a dual budget,
the GAO consistently argued that capital investments need to be clearly
delineated within the federal budget. In 1985, the GAO wrote a
two-volume report which included a reiteration of the need for clearly
specifying the level and types of federal capital expenditures.

The General Accounting Office has not been 
alone in its concern over the need for improved 
information on U.S. federal capital investments. 
Choate & Walter (1981), for example, called for 
the creation of a national capital budget analysis 
as a means of helping to stem what they per- 
ceived as a rapidly deteriorating infrastructure. 
Eisner (1984, 1986) and Eisner & Pieper (1984) 
have noted that a comprehensive analysis of the 
federal government’s capital expenditures 
would reveal a far less gloomy picture of the real 
U.S. federal deficit. Gordon et al. (1986), based 
on a simulation experiment designed to 
examine the potential macroeconomic effects of 
the federal government switching from a unified
to a dual budget, also called for better capital in- 
vestment information. 

In 1984, Congress' interest in the need for improved capital investment
management culminated in P.L. 98-501 (i.e. Public Works Improvement Act
of 1984-Federal Capital Investment Program Information Act of 1984). The
major provisions of this Act established the National Council on Public
Works and required it to provide (over a 3 year period) a consolidation
of existing information on the condition of the nation's infrastructure.
This legislation also requires the President's budget to include an
identification and analysis of capital spending programs. The capital
spending information, which is prepared by OMB as a supplement to
Special Analysis D, is a permanent requirement.

The increasing focus on information related to federal capital
expenditures is consistent with, if not a prerequisite for, greater use
of B-C analysis techniques for selecting individual capital investments.
That is, effective capital expenditure decision-making is dependent on
the separation of capital from operating expenditures. As Bowsher (1985,
pp. 18-19) noted:


The separation of capital and operating expenditures
within the unified budget would: Elevate the visibility of
capital investment decisions. Facilitate the development
of replacement planning. Allow a comparison of the
long-term costs and benefits of capital investments
across budget functions (emphasis added).


Concern over the process by which various capital expenditures are
chosen has already surfaced. Nowhere is this concern more obvious than
with capital expenditures related to information technology systems.
Section 43.2(c) of the 1986 version of the OMB's Circular A-11 (i.e. the
OMB's instructions on the preparation and submission of FY 1988 budget
estimates) requires the use of "rational economics"-based B-C analysis
techniques for major initiatives related to information technology
systems (ITS).² More to the point, major initiatives regarding
information technology systems are expected to show a positive net
present value, based on a 10% discount rate³ (unless otherwise justified
on non-financial criteria). The computation of internal rates of return,
B-C ratios and the use of sensitivity analyses are also encouraged by
OMB.
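
To make the three measures concrete, the sketch below computes a net present value, a B-C ratio and an internal rate of return for a single project at the 10% discount rate mentioned above. The cash flows are purely hypothetical (a $500,000 outlay followed by five annual savings of $150,000) and are not drawn from the study; the function names and the bisection search for the internal rate of return are our own illustrative choices, not a procedure prescribed by Circular A-11.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] is received t years from now."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def benefit_cost_ratio(rate, benefits, costs):
    """Present value of benefits divided by present value of costs."""
    return npv(rate, benefits) / npv(rate, costs)


def irr(cash_flows, low=0.0, high=1.0, tol=1e-9):
    """Internal rate of return by bisection.

    Assumes the NPV changes sign exactly once between `low` and `high`.
    """
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2.0


# Hypothetical ITS project (illustrative numbers only).
costs = [500_000, 0, 0, 0, 0, 0]
benefits = [0, 150_000, 150_000, 150_000, 150_000, 150_000]
net_flows = [b - c for b, c in zip(benefits, costs)]

DISCOUNT_RATE = 0.10  # the rate cited in the text

project_npv = npv(DISCOUNT_RATE, net_flows)
project_bcr = benefit_cost_ratio(DISCOUNT_RATE, benefits, costs)
project_irr = irr(net_flows)

print(f"NPV at 10%: {project_npv:,.0f}")
print(f"B-C ratio:  {project_bcr:.3f}")
print(f"IRR:        {project_irr:.2%}")
```

For this hypothetical project the net present value is positive, the B-C ratio exceeds one and the internal rate of return exceeds 10%, so all three measures point the same way. With less regular cash flows the three measures need not rank competing projects identically.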


HYPOTHESES 


The two basic objectives usually associated with the use of B-C analysis
techniques in the public sector are to encourage: (1) the efficient
allocation of resources (i.e. achieve a maximum level of output for a
given level of inputs or minimize inputs for a given level of output),
and (2) an equitable distribution of wealth (income) among different
classes of people and regions (Maass, 1966; Haveman & Weisbroad, 1977).
However, as many have discussed, the notion of efficiently allocating
resources (i.e. the "rational economic" philosophy) clearly has been the
dominant of the two objectives (Haveman & Weisbroad, 1977; Maass, 1966;
Rowen, 1977; Williams, 1977; Wildavsky, 1966). For example, Haveman &
Weisbroad (1977) noted:


Cost-benefit analysis has been concerned primarily with
efficiency in the allocation of resources . . .

Insofar as cost-benefit analysis is directed at allocative
efficiency, it can be viewed as an attempt to replicate for
the public sector the decisions that would be made if private
markets worked satisfactorily. That is to say, the efficiency
orientation of cost-benefit analysis can be viewed
as an attempt to develop a public sector analogue for private
market decision-making . . . (p. 137).


A government initiative toward requiring the use of B-C analysis
techniques, such as that required by section 43.2(c) of Circular A-11,
does not guarantee an efficient allocation of federal resources. Indeed,
many government benefits and costs cannot be valued in a competitive
marketplace and hence the conventional notion of comparing benefits to
costs in such an environment is fuzzy, at best. Even in the private
sector, where the marketplace is assumed to work well in terms of
valuing the benefits and costs of a project, empirical studies cast
doubt on the general existence of a positive relationship between the
adoption of risk-adjusted discounted cash flow techniques for capital
investment decisions and firm performance (Christy, 1966; Klammer, 1973;
Pike, Dii; Haka et al., 1985).⁴ Nevertheless, from a "rational economic"
perspective (i.e. holding non-economic factors constant), B-C analysis
techniques should assist in moving the decision-making process for
federal capital investments in the direction of a more efficient
allocation of resources. The simple need to consider the time value of
money (i.e. discounting cash flows) is sufficient to assure this
argument (Haveman & Margolis, 1977). Further, as Rowen (1977, p. 547)
notes, B-C analysis assists ". . . in the formulation of objectives and
of alternative actions as well as contributing to the process of choice
between them." These latter activities also should assist in the
efficient allocation of resources to federal capital investments.


² Expenditures on such systems are nontrivial. For example, President Reagan noted in Management of the United States
Government, 1987 (p. 22): ". . . In 1986, approximately $15 billion will be spent on information technology, about 1.6% of
the total budget." Further, as OMB gains experience with B-C analysis techniques, there is no reason not to expect these
techniques to be required for evaluating a broadening array of capital expenditures.

³ The 10% discount rate is based on the OMB's Circular A-94 (1971).

⁴ Other factors could account for the private sector findings. For example, several researchers have argued that only firms
facing a low degree of environmental uncertainty can reap the benefits derived from using sophisticated capital budgeting
techniques (Sundem, 1974, 1975; Fama, 1977; Schall & Sundem, 1980; Harpaz & Thomadakis, 1982; Haka, 1987). It has also
been pointed out that the effective use of these techniques of capital budgeting may be dependent on the use of long-term
managerial incentive plans (Haka, 1987; Pike, 1985; Statman & Sepe, 1984), degree of decentralization (Haka, 1987) and/or
on a broad scope management accounting system (Berg, 1965; Christy, 1966; Gordon et al., 1979; Gordon & Pinches, 1984).

In an ideal world, the argument concerning a 
positive relationship between the use of B—C 
analysis techniques and the efficient allocation 
of federal resources should be tested by 
assessing the effect of such techniques on the 
government's ultimate level of performance.
Unfortunately, a well accepted yardstick for 
measuring performance levels in the public sec- 
tor does not exist (Flynn, 1986; Anthony & 
Herzlinger, 1975). However, a weak-form test of 
the above argument is to assess the intermediate 
relationship between the use of B—C techniques 
and resource allocation decisions associated 
with particular federal capital expenditure re- 
quests. That is, in order for there to be a positive 
relationship between the use of B—C analysis 
techniques and the efficient allocation of re- 
sources, resource allocation decisions would 
have to be affected by the use of such analysis. If 
there were no effect at this intermediate stage, a 
positive relationship between the final perform- 
ance and the use of B-C analysis could not 
occur. Thus, an association between the use of 
B-C analysis techniques and resource allocation 
decisions is a necessary, though not sufficient,
condition for B-C analysis to have an effect on
the efficient allocation of resources. 

In the executive branch of the U.S. federal 
government, the argument that B—C analysis 
techniques can affect resource allocation deci- 
sions can be viewed from at least two levels: (1) 
the department (agency), and (2) the OMB.⁵ At
the department level, the use of B—C analysis 
techniques by a particular subunit proposing a 
capital expenditure should enhance the proba- 
bility that senior administrators within the de- 
partment will support the project. That is, it may 
be argued that economics-based B—C analysis 
techniques should increase the chances that a 
capital expenditure request will be included in 
the budget request forwarded to the OMB by a 
department relative to projects not supported 
by such an analysis. This should occur because 
the B—C analysis techniques present a clear 
economic argument for a project, after considering
the various options.⁶

In contrast, it could be argued that a political, 
rather than “rational economic”, perspective 
may be dominant at the department level of de- 
cision-making and thus no relationship will exist 
between B—C analysis techniques and resource 
allocation decisions. This latter argument is pre- 
mised on the belief that department level 
administrators are close to the political (social) 
reality of the decisions being made and thus will 
be most strongly influenced by noneconomic 
concerns. 

⁵ The U.S. federal budget process requires the various federal departments (agencies) to forward funding requests to the OMB.
The OMB ultimately chooses the final items for inclusion in the budget sent to Congress. As noted in the introduction, our
concern in this paper is limited to the executive branch of the federal government and hence the Congressional level of
funding approval will not be considered.

⁶ Whether or not B-C analysis techniques provide a substantive economic argument could, of course, be debated. For
example, there is a body of literature which suggests that techniques such as these are used to legitimate plans and actions
to upper level managers via the "appearance" of rationality (DiMaggio & Powell, 1983; Meyer, 1986; Meyer & Rowan, 1977).
For purposes of this paper, we assume that B-C analysis techniques have at least a component of substantive rationality.


Whether one argues for the economic or the political perspective of
department level decision-making is in large part a function of the type
of projects under consideration. For some projects, such as those
related to cost minimization, it seems reasonable to expect the economic
perspective to dominate. For other projects, such as those related to
socially required "must do" projects, it seems reasonable to expect the
political perspective to dominate. The projects under investigation in
this study are of a cost reduction nature (i.e. none of the projects is
of a "must do" type). In other words, for a given level of information
processing output, the primary concern should be to determine the
information technology system configuration which has the minimum cost.
For these types of projects, it seems reasonable to expect the economic
perspective to dominate. This argument can be empirically examined by
testing the following general hypothesis:


H1. At the department level, an information technology 
system project has a greater likelihood of obtaining fund- 
ing support when B-C analysis techniques are utilized 
than when they are not used. 


At the level of OMB, it also may seem reason- 
able to expect a positive correlation between 
the likelihood of an ITS project being approved 
and the use of B—C analysis techniques. Again, 
this argument is based on the “rational econ- 
omic” perspective and is a relative one in that it 
suggests that projects supported by B—C analysis 
techniques have a comparative advantage, 
ceteris paribus, over projects not supported by
such an analysis. However, the argument at the 
OMB level is less cogent than at the department 
level, even for cost-reducing type projects, for at 
least two reasons. First, as projects move up the 
organizational ladder the importance of “politi- 
cal rationality” vis-a-vis “economic rationality” 
may become more prominent in the U.S. federal 
government. More specifically, whereas the 
economic aspects of benefit—cost analysis may 
dominate at the department level, compromise 
and judgement may be more appropriate con- 
cerns at the decision level of the OMB. This line 
of reasoning suggests that the use of economics- 
based B-C analysis techniques may have no 
significant effect on a project’s likelihood of ob- 
taining funding at the OMB level, regardless of 
the project type being considered. The fact that 
different decision strategies may be appropriate
at different levels of organizational activity is
supported by the work of Anthony (1965), Berg
(1965), Dirsmith & Jablonsky (1979a,b), Parsons
(1960) and Thompson (1967).⁷
The second possible reason for not expecting 
a significant relationship between the likelihood 
of a project being approved and the use of 
B-C analysis techniques at the OMB’s decision- 
making level is that it is difficult to imagine a situ- 
ation where OMB decision-makers could care- 
fully examine the quantitative details provided 
in all the requests received from the various de- 
partments (agencies). Indeed, a primary reason 
for the OMB requiring the departments to use
B-C analysis techniques is to have such techniques
considered at the level where they are
most likely to be a manageable screening device.
Alternatively, one could argue that the B-C
analysis techniques provide the OMB with a filtering
mechanism for dealing with the large volume
of budget requests. This latter argument
would suggest a significant relationship between
the use of B-C analysis and a project's level of
funding. Hence, we again have an empirical 
question which can be tested by the following 
second general hypothesis: 


H2. At the level of the OMB, an information technology
system project has a greater likelihood of obtaining funding
support when B-C analysis techniques are utilized
than when they are not used.


The two general hypotheses noted above can 
be tested as shown below. In both cases, rejec- 
tion of the null hypothesis would be consistent 
with acceptance of the above stated substantive 
hypothesis. 


\( H_{01}:\ PS_{D}UBC = PS_{D}IBC \)

\( H_{11}:\ PS_{D}UBC > PS_{D}IBC \),

where \( PS_{D}UBC \) = likelihood of funding support at the department
level (i.e. a project is included in the department's budget request
sent to the OMB) for a project where B-C analysis techniques are used;
and \( PS_{D}IBC \) = likelihood of funding support at the department
level (i.e. a project is included in the department's budget request
sent to the OMB) for a project where B-C analysis techniques are not
used.

\( H_{02}:\ PS_{OMB}UBC = PS_{OMB}IBC \)

\( H_{12}:\ PS_{OMB}UBC > PS_{OMB}IBC \),

where \( PS_{OMB}UBC \) = likelihood of funding support at the OMB level
(i.e. a project is included in the OMB's budget request sent to
Congress) for a project where B-C analysis techniques are used; and
\( PS_{OMB}IBC \) = likelihood of funding support at the OMB level (i.e.
a project is included in the OMB's budget request sent to Congress) for
a project where B-C analysis techniques are not used.


⁷ This argument also is consistent with the argument that a contingency approach to the use of management accounting
systems is required (Gordon & Miller, 1976; Gordon et al., 1978; Otley, 1980; Larcker, 1981).
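
One way a one-sided hypothesis of this form could be examined with a small sample is a Fisher exact test on a 2 × 2 table of funded and unfunded projects, split by whether B-C analysis techniques were used. The sketch below is offered only as an illustration of that idea: the counts are hypothetical, the function name is ours, and nothing here should be read as the test actually applied in the study.

```python
from math import comb


def fisher_one_sided(funded_ubc, n_ubc, funded_ibc, n_ibc):
    """One-sided Fisher exact p-value.

    Probability, under the null hypothesis of equal funding likelihoods,
    of seeing at least `funded_ubc` funded projects among the `n_ubc`
    projects that used B-C analysis, holding all table margins fixed.
    """
    total_funded = funded_ubc + funded_ibc
    n_total = n_ubc + n_ibc
    denominator = comb(n_total, total_funded)
    upper = min(n_ubc, total_funded)
    numerator = sum(
        comb(n_ubc, k) * comb(n_ibc, total_funded - k)
        for k in range(funded_ubc, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts (not the study's data): 8 of 10 projects using
# B-C analysis funded, against 3 of 10 projects ignoring it.
p_value = fisher_one_sided(funded_ubc=8, n_ubc=10, funded_ibc=3, n_ibc=10)
print(f"one-sided p-value = {p_value:.4f}")
```

A small p-value would lead to rejection of the null hypothesis of equal likelihoods in favor of the directional alternative. The exact test is used here because it does not rely on large-sample approximations.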


EMPIRICAL STUDY 


In order to test the above hypotheses, an em- 
pirical study was conducted. The setting of the 
study and its design are discussed below. 


Setting 

Although B-C analysis techniques (based on
discounted cash flows) are now required by the
OMB for the acquisition of key information
technology systems, some federal government
departments have used such techniques prior to
FY 1988 budget requests. For most of these departments,
however, these techniques have not
been consistently applied to all such requests.
The study described herein relates to one such
department.⁸ That is, our study was conducted
within a major federal government department 
in which, previous to section 43.2(c) of Circular 
A-11, B-C analysis techniques were used in 
some subunit requests for ITS, but not in others. 
The specific techniques utilized included the 
net present value, internal rate of return and the 
benefit—cost ratio. 

The study included data on 25 ITS requests, 
over a 4 year period (FY, 1984-87). The re- 
quests included .all ITS proposals ranging in 
value from $100,000 to $2,000,000.’ The prop- 
osals were classified as being prepared either 
with the aid of B-C analysis techniques (as de- 
fined earlier) or ignoring B—C analysis tech- 
niques, where the latter category included what 





®*The department was promised anonymi 
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could only be described as feeble attempts at ap- 
plying B—C analysis concepts. "° 

The activities carried out by the individual 
subunits within the department vary in some re- 
spects, but all subunits are essentially working 
toward similar overall objectives by virtue of 
their being in the same department. Further, ITS 
requests are for similar tasks across subunits in 
that they all relate to information processing 
hardware and software needs. This latter point is 
particularly true in terms of acquisitions costing 
$100,000 or over. In other words, whereas there 
are numerous requests for small ITS acquisi- 
tions, most requests for major acquisitions (i.e. 
$100,000 or over) came from those units with 
heavy information processing responsibilities. 
Hence, the subcategory of capital expenditures 
under analysis provides a group of expenditures 
which are reasonably homogeneous across sub- 
units in both purpose and relative size (i.e. they 
were all considered large, in terms of dollars, by
department standards).


The setting described above provided a un- 
ique opportunity for examining the effects of 
using B—C analysis techniques in the executive 
branch of the federal government. More to the 
point, the setting provided data on a department 
in which subunits voluntarily chose to use B-C 
analysis techniques for justifying some projects, 
but not others, prior to the OMB’s mandatory re- 
quirement of the use of B—C analysis. All of the 
projects considered in the study are of a “cost re- 
duction”, and not a “must do”, nature. Further, 
there is no a priori reason to assume that the 
projects supported by B—C analysis techniques 
are financially more attractive and thus more 
amenable to such techniques. Hence, we have 
what seem to be two unbiased samples from the 
population of all projects and thus it was possi- 
ble to examine the relationship between the use 
of B—C analysis techniques and department level 
funding support for an ITS project. 


⁹ The $100,000 minimum was established as the significant cut-off level by the senior management of the department under
study. In other words, ITS projects costing $100,000 or more were considered as being significant in amount by the
department.
¹⁰ A panel of four experts categorized the proposals.

Support for a project was determined in a
dichotomous fashion. If a project was included
in the final budget sent to the OMB, it was
deemed to have received funding support from 
the department’s senior management. Alterna- 
tively, if a project was not included in the final 
budget request, it was considered as not receiv- 
ing such support. By following up on which ITS 
projects were approved by the OMB, and thus 
included in the budget that the President sent to 
Congress, it was also possible to examine the re- 
lationship between the use of B—C analysis tech- 
niques and OMB level funding support for an ITS 
project. Here again, support for a project was de- 
termined in a dichotomous fashion. A project 
was considered supported or not supported 
based on whether it was included in the final 
budget sent to Congress by the President.¹¹


Results 

Although all projects were considered large 
by department standards, the size (in terms of 
dollars) could still have an effect on whether B— 
C analysis techniques are utilized. Hence, before 
checking the main hypotheses, the 25 projects 
were divided into two groups: (a) above the me- 
dian, and (b) below the median. The median 
project itself (which did not use B-C analysis) 
was deleted from the analysis (i.e. only 24
observations were utilized). Of the 12 project
proposals below the median, four used B-C analysis
techniques and eight did not. Of the 12 projects 
above the median, six project proposals used B— 
C analysis techniques and six did not. Based on a 
Fisher Exact Test, there does not appear to be 
any significant size effect on the use of B—C 
analysis techniques for the department under 
study (i.e. p = 0.340). Hence, all 25 projects 
were used to test the main hypotheses discussed 
in the previous section of this paper. 

Of the 25 initial requests, 10 were classified as
being prepared based on B-C analysis techniques
and 15 were considered to ignore these
concepts. Out of the 10 requests using B-C
analysis techniques, 9 received funding support
at the department level (i.e. were included in the
department’s budget request sent to the OMB)
and 7 of these received funding support at the
OMB level (i.e. were included in the OMB’s and
thus the President’s budget request sent to Con- 
gress). Out of the 15 requests ignoring B—C 
analysis techniques, 8 received funding support 
at the department level and 7 of these received 
funding support at OMB’s level. Given the sam- 
ple size and dichotomous nature of the data (i.e. 
a proposal either does or does not receive sup- 
port), a one-tailed Fisher Exact Test was used to 
statistically test the two main hypotheses of this 
paper. The flow of proposals and data summary
used for testing the main hypotheses are
provided in Fig. 1.

As indicated in Fig. 1, the first null hypothesis, 
discussed in the first section of this paper, is re- 
jected at the p = 0.10 level. More specifically, 
the hypothesis would be rejected up to the p = 
0.065 level. Thus, the use of B—C analysis tech- 
niques apparently had a significant positive rela- 
tionship with senior administrators’ support for 
an ITS request in the department under study. In 
contrast, the second null hypothesis cannot be 
rejected at the p = 0.10 level, or at any other rea- 
sonable level for that matter (i.e. p = 0.547). 
Hence, the use of B—C analysis techniques in the 
preparation of ITS proposals apparently did not 
have a statistical relationship with the final fund- 
ing approval of a program at the OMB level. 
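
The three Fisher Exact Test results reported in this section can be checked from the counts given in the text. The following is a minimal sketch, not the authors' own computation. It assumes a one-tailed test computed as a hypergeometric tail probability; the helper names (`hypergeom_pmf`, `fisher_one_tailed`) and the choice of tail for each test are assumptions made for illustration. Under these assumptions the counts reproduce the reported values to three decimals (0.340, 0.065 and 0.547).

```python
from math import comb


def hypergeom_pmf(k, row1, row2, col1):
    """P(top-left cell == k) for a 2x2 table with fixed margins.

    row1, row2: row totals; col1: first-column total.
    """
    return comb(row1, k) * comb(row2, col1 - k) / comb(row1 + row2, col1)


def fisher_one_tailed(a, b, c, d, tail):
    """One-tailed Fisher exact p-value for the table [[a, b], [c, d]].

    tail="greater": P(top-left cell >= a); tail="less": P(top-left cell <= a).
    The choice of tail is an assumption; the text only says "one-tailed".
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    ks = range(a, hi + 1) if tail == "greater" else range(lo, a + 1)
    return sum(hypergeom_pmf(k, row1, row2, col1) for k in ks)


# Size check: rows = below/above median, columns = used / did not use B-C.
p_size = fisher_one_tailed(4, 8, 6, 6, tail="less")

# Department level: rows = UBC/IBC, columns = supported / not supported.
p_department = fisher_one_tailed(9, 1, 8, 7, tail="greater")

# OMB level: only proposals forwarded by the department (9 UBC, 8 IBC).
p_omb = fisher_one_tailed(7, 2, 7, 1, tail="less")

print(f"size effect:      p = {p_size:.3f}")
print(f"department level: p = {p_department:.3f}")
print(f"OMB level:        p = {p_omb:.3f}")
```

The OMB-level value of 0.547 corresponds to the lower tail, that is, the probability of 7 or fewer supported proposals among the 9 forwarded proposals that used B-C analysis. The upper tail for the same table is about 0.88, so the conclusion of no significant relationship at the OMB level does not depend on which tail is taken.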


Discussion 

The above findings support the conceptual
argument that differing decision strategies are
utilized at different organizational levels. More
specifically, it would appear that B-C analysis
techniques are more likely to affect resource
allocation decisions (i.e. serve as a means of
bringing “rational economic” discipline to the
decision-making process) at lower rather than
higher decision-making levels.

¹¹ The use of a dichotomous variable for support or nonsupport of a project at both the department and OMB level was based
on the belief that the key issue was to assess whether a project made the budget agenda. The fact that a project gets funded
for less than the initial request is expected in the federal government budgeting process. Further, additional funding could be
forthcoming in future years for projects receiving funding in previous years. It should be noted, however, that all projects
supported were supported at a substantial level of the original request.

Fig. 1. Flow of proposals and data summary. [Flow diagram not reproduced. Panels: Department Level Summary (Fisher Exact Test, p = 0.065); OMB Level Summary (Fisher Exact Test, p = 0.547); Congress.]
UBC, project proposals using benefit-cost analysis techniques; IBC, project proposals ignoring benefit-cost analysis techniques;
S, received funding support; SN, did not receive funding support; connecting lines, projects for which funding was requested at
a previous organizational level; S1 ... Sn, department subunits 1 to n.


An alternative explanation for the findings of 
this study could be provided. That is, it could be 
argued that the ITS projects forwarded to OMB 
were all strong ones, whether or not B—C tech- 
niques were used. Based on this reasoning, there 
may be little surprise at the findings that the 
OMB only cut a few projects and that the cuts for 
those proposals using B—C analysis techniques 
were not significantly different from those not 
using such techniques. Of course, this
explanation could be viewed as a complement to,
rather than a substitute for, the argument that
B-C analysis techniques are a more useful
filtering device at the lower levels of
decision-making than at higher levels. The OMB
may well have initiated the requirement
regarding the use of B-C analysis techniques at
the department level [i.e. section 43.2(c) of
Circular A-11] so as to allow itself the
opportunity to concentrate on alternative
decision strategies.


An interesting point concerning our findings
relates to the often noted argument that the
OMB initiates scientific budgeting techniques,
such as B-C analysis techniques, in an effort to
co-opt more power for itself vis-a-vis the
departments and agencies. That is, it has been
argued that such techniques shift the
policy-making power from a pluralistic democracy,
dominated by a multitude of interest groups, to a
centralized policy-making body which solves
problems through formal authority (Gordon &
Schick, 1979). The findings presented in this
paper tend to support the opposite view. Although
B-C analysis techniques may lead to more
centralization in terms of the checks on lower
level decisions, the evidence presented in this
study suggests that department level managers
still have a critical influence on the final
resource allocation decisions. In fact, the use of
B-C analysis techniques tends to solidify, rather
than diffuse, the decisions made at the department
level. Of course, these findings may merely
reflect the fact that applying B-C analysis
techniques on a project-by-project basis (as
examined in this study) is different from moving
from an incremental to a rational-comprehensive
form of overall budgeting as discussed by
Gordon & Schick. In other words, B-C analysis
techniques applied to individual projects can be
accommodated within the incremental budgeting
framework, as evidenced by the fact that
many approved ITS projects were not supported
at 100% of the initial funding request.

¹² To the extent that proposals using B-C analysis techniques are correlated with the quality of projects, these results could be
interpreted as measuring the effect of quality rather than the use or nonuse of B-C analysis. Upon questioning the experts
who originally classified the proposals into use or nonuse of B-C techniques, the general consensus was that project quality
was unrelated to the use/nonuse of B-C analysis techniques. Of course, the proposal quality and the B-C use/nonuse issue
may be related, which is consistent with the economic screening device argument.

The setting of this study could be viewed in 
terms of the notions underlying agency theory 
research. One agency relationship involved in 
this study is between OMB decision-makers (as 
principals) and the senior managers in the de- 
partments (as agents). Another agency relation- 
ship is between the senior department level 
managers (as principals) and the lower level 
subunit heads (as agents). In both cases, the 
agents have access to information not readily 
accessible to the principals (i.e. there is an
asymmetry of information). Further, conflicts of
interest are present in the public sector as well
as the private sector (i.e. divergence of
preferences). Stated differently, public sector
bureaucrats have incentives to seek wealth transfers (i.e.
gain control over resources) in the same manner 
as business people (Watts & Zimmerman, 1986, 
pp. 226-227). Thus, the standard agency prob- 
lems are present in the setting of the study under 
investigation. 

Viewing this study from an agency framework
raises the following interesting issue. Key
ingredients to minimizing agency costs are
“observability” and “truth-inducing incentive
mechanisms”. Unfortunately, neither of these
factors seems to be present in the situation
described in this study. That is, little, if any, ex post
observability of the results of approved ITS
projects takes place. Further, truth-inducing
mechanisms (e.g. managerial incentive plans)
are conspicuous by their absence in the federal
government.¹³ Thus, an interesting question to
ask is: What is there to prevent agents from
presenting B-C analysis “signals” which are based
on excessively optimistic projections? This
question is particularly relevant in today’s
environment where B-C analysis is required for all
major ITS initiatives and resources are
presumably scarce (i.e. federal budget deficit cutting is a
major political issue). Although the answer to
this question is beyond the scope of this paper, it
does point in the direction of an interesting area
of future research.

Another area of potential future research con- 
cerns the signalling effect of B—C analysis tech- 
niques at the department level. Now that such 
techniques are required for all major ITS
initiatives, it would be interesting to examine
whether some attributes of B—C analysis tech- 
niques affect resource allocation decisions diffe- 
rently than others. It also would be interesting to 
determine whether B—C analysis techniques be- 
come more commonly used at the department 
level for non-ITS projects. 


CONCLUSIONS 


Drawing general inferences based on one
study, especially one of a small sample size, is
obviously unwise. Nevertheless, the findings of
this study provide optimism for a better
understanding of the usefulness of discounted cash
flow techniques associated with benefit-cost
analysis. More specifically, this study confirms
the argument that the use of B-C analysis
techniques affects the resource allocation process
differently at different organizational levels (i.e.
different decision strategies are appropriate at
different organizational levels).

¹³ The U.S. federal government has some incentive plans, but the relationship between these plans and managerial behavior
is obtuse at best.

Further, it would appear that B—C analysis 
techniques can and do affect resource allocation 
decisions in a manner consistent with the “ra- 
tional economic”-based arguments. That is, 
these techniques introduce an added degree of 
discipline and screening which, at a minimum, 
are consistent with facilitating economic
planning and stemming economic stress. However,
these techniques seem to be most useful as a first
line of defense at lower, rather than higher,
levels of decision-making. As such, they
represent but one item in the arsenal of weapons that
organizations may use to improve overall 
performance. Indeed, given the multiplicity of 
decisions and decision levels which affect or- 
ganizational performance, any effort to correlate 
one decision strategy with ultimate organiza- 
tional performance must be approached with 
extreme caution. A more fruitful line of inquiry 
may be to look at the effects of a decision 
strategy on the intermediate process of resource 
allocation decisions, rather than the ultimate 
outcome. This was the tack pursued in this
paper.


Abstract


Since internal auditors are subject to incentives and sanctions controlled by management, concern exists 
about their professional objectivity. A decision-making experiment which employed 58 internal auditors 
from three banks as participants was performed to examine the question: Can a firm's management bias the 
professional objectivity of the firm's internal auditors? The results indicate the internal control system 
evaluations reached by internal auditors who were not members of the Institute of Internal Auditors (IIA) 
were biased by knowledge of management's desired evaluation outcomes. IIA members, however, resisted
management's efforts to bias their evaluations. This outcome suggests that IIA membership may be an
important determinant of internal auditors’ professional objectivity, an issue which has not previously been
identified in the accounting literature.


The professional objectivity of internal auditors 
is an important issue, for third parties often rely 
on the work of internal auditors. A firm’s board 
of directors, for example, must often rely on the 
judgment of the firm’s internal auditors as to the 
adequacy and effectiveness of the organization’s 
system of internal control (Institute of Internal 
Auditors, 1981, p. 1). In addition, nearly all 
external auditors rely upon the work of internal 
auditors to some extent (Ward & Robertson, 
1980). Since internal auditors are subject to 
incentives and sanctions controlled by their 
firm’s management, concern exists about 
management’s potential ability to bias their 
professional objectivity (American Institute of
Certified Public Accountants [AICPA], 1986, AU
322.07). The Institute of Internal Auditors (IIA),
the professional organization for internal
auditors, recognizes the importance of
professional objectivity and conducts educational
activities which emphasize this issue (IIA,
1981). Further, IIA professional standards
require that members “. . . are not to subordinate 
their judgment on audit matters to that of 
others.” (IIA, 1981, pp. 100-102). Members of 
the IIA may, therefore, resist any efforts by man- 
agement to bias their professional objectivity. 
Accordingly, the research problem considered 
in this paper may be stated as follows: Can a 
firm’s management bias the professional objec- 
tivity of the firm’s internal auditors? 


THEORETICAL BACKGROUND 

Internal auditors are members of the
organizations which employ them and may, therefore, be
influenced by incentives and sanctions
controlled by their organization’s management. A
manager’s ability to influence, persuade and 
motivate followers is thought to be derived from 
the power the manager is perceived to possess. 
In their frequently referenced article, French &
Raven (1960) identified five forms of managerial
power: (1) coercive power, (2) reward power,
(3) legitimate power, (4) expert power, and
(5) referent power. A manager’s coercive
power is based on fear, as when a follower or 
subordinate believes that failing to comply with 
a manager’s wishes might result in punishment 
or some other unattractive outcome at some 
later time. A manager’s reward power is derived 
from the expectation a subordinate may receive 
rewards (e.g. praise, recognition or an increase 
in income) in return for compliance with a 
manager’s wishes. The legitimate power of a 
manager is derived from the manager’s position 
in the organizational hierarchy, e.g. the
manager’s position as an executive. A manager’s
expert power is derived from some special skill,
expertise or knowledge which the manager is
perceived to possess. A manager’s referent power
is based on the manager’s personal attractive- 
ness or appeal. Admired managers are said to 
possess charisma, the ability to attract and in- 
spire followers (French & Raven, 1960, pp. 607— 
623). 

In the related accounting literature (e.g. 
AICPA, 1986, AU 322.07; Brown, 1983; Gibbs & 
Schroeder, 1980; Schneider, 1985), it is 
emphasized that internal auditors should be in- 
sulated from the influence of their firm’s man- 
agement by providing the appropriate organiza- 
tional status to the internal audit department. 
Such an approach aims at an organizational 
structure that makes it inappropriate for the 
firm’s management to exert legitimate power 
over the firm’s internal auditors. Even if this ap- 
proach is successful, however, it seems likely 
that a firm’s management can still exert some 
level of influence on the firm’s internal auditors 
because the internal auditors perceive their 
firm’s management to possess either coercive 
power, reward power, expert power or referent
power.

Although internal auditors are members of the
organization by which they are employed, they
must exhibit professional objectivity if their 
work is to be relied upon by third parties outside 
their organization (Ward & Robertson, 1980; 
Brown, 1983; Gibbs & Schroeder, 1980; 
Schneider, 1985). The Institute of Internal 
Auditors (IIA) defines professional objectivity as 
“... an independent mental attitude which internal
auditors should maintain in performing
audits” (IIA, 1981, pp. 100—102). Professional 
objectivity is associated with membership in a 
profession, which is frequently characterized by: 
(1) a belief in and acceptance of the goals and 
values of the profession, (2) a willingness to 
exert considerable effort on behalf of the profes- 
sion, and (3) a desire to maintain membership in 
the profession (Aranya et al., 1981; Sorensen & 
Sorensen, 1974). The Institute of Internal 
Auditors (IIA), the professional organization for 
internal auditors, was created in 1941 as an in- 
ternational organization which emphasizes the 
development of internal auditing as a profession.
Members of the IIA have agreed upon a code of 
ethics, a statement on internal auditors’ respon- 
sibilities, a program of continuing education, a 
common body of knowledge, a certification pro- 
gram and professional standards (IIA, 1981). 
Although internal auditors who are not IIA mem- 
bers may or may not perceive themselves to be 
members of a profession, it seems clear that in- 
ternal auditors who are IIA members have, 
through their membership in this organization, 
indicated they consider themselves to be audit- 
ing professionals. 

A substantial body of academic research has 
addressed the potential conflict that can exist 
between the requirements imposed by member- 
ship in an organization and the requirements im- 
posed by membership in a profession (e.g. 
Shepard, 1961; Gouldner, 1958; Blau & Scott, 
1962; Friedlander, 1971; Sorensen & Sorensen, 
1974; Flango & Brumbaugh, 1974; Aranya et al., 
1981; Norris & Niebuhr, 1984; Aranya & Ferris, 
1984). Apparently, professionals who are mem- 
bers of commercial or industrial organizations 
are more likely to experience conflict between 
the demands of their profession and the 
demands of their organization than are
professionals who are members of professional
organizations. In a recent study of over 2000 CPAs and
CAs, Aranya & Ferris (1984) found that CPAs 
and CAs employed by commercial organizations 
perceived more conflict to exist between the 
demands of their organization and the demands 
of their profession than their contemporaries 
employed by public accounting firms. In expla- 
nation of these findings, Aranya & Ferris 
suggested that the goals of commercial or in- 
dustrial organizations may often conflict with 
the professional requirements of CPAs and CAs, 
while the goals of public accounting firms prob- 
ably coincide closely with the professional re- 
quirements of CPAs and CAs. With this in mind, 
one would expect the potential for conflict be- 
tween the demands of internal auditors’ organi- 
zations and the demands of their profession to be 
quite high, for internal auditors are usually mem- 
bers of commercial or industrial organizations. 
The potential conflict between the goals of 
their organizations and the demands of their pro- 
fession can threaten the professional objectivity 
of internal auditors. Professional objectivity is of 
particular importance in situations which in- 
volve professional judgment. Professional judg- 
ment is involved when a trained individual 
makes decisions in situations where the “cor- 
rect” decision either cannot be specified, i.e. 
where there is no normatively correct decision 
outcome, or where a decision is required cur- 
rently and the “correct” decision outcome can’t 
be determined until after the passage of time or 
the occurrence of subsequent events.¹ One
cannot, for example, determine the “correctness” of
an internal auditor’s evaluation of the adequacy 
of an internal control system at the time such a 
decision must be reached. Since the “correct- 
ness” of decisions which involve professional 
judgment is difficult to determine, great
importance is placed on the objectivity of the
decision-maker. An objective decision-maker is one who
reaches decisions which are not biased in favor
of the outcomes desired by a particular group or 
individual. Correspondingly, a decision-maker 
who reaches decisions which are biased in favor 
of the outcomes desired by some particular 
group or individual is not an objective decision- 
maker. 

Relatively little accounting research has
focused on management’s ability to influence 
the professional decisions of members of the or- 
ganization. In one of the few such studies, Har- 
rell (1977) examined how the management 
control decisions of middle managers can be in- 
fluenced by senior management. His experiment 
involved 75 subjects who were randomly as- 
signed to five groups, a control group and four 
experimental groups. The subjects completed a 
decision-making exercise which involved 
evaluating the performance of a number of 
hypothetical organizational subunits using five 
evaluation criteria. The evaluation policy prefer- 
red by senior management was initially com- 
municated to the subjects in the four experi- 
mental groups by indicating the weights man- 
agement desired to be placed upon the five 
evaluation criteria. The subjects in three of the 
four experimental groups were also provided 
with either consonant, dissonant or random out- 
come feedback. Senior management’s influence 
was strongest in the group which received infor- 
mation indicating the weights senior manage- 
ment desired to be placed on the five evaluation 
criteria in conjunction with consonant feedback 
indicating the decision outcomes associated 
with using management’s preferred weights. 

In Harrell’s (1977) experiment, it was approp- 
riate for senior management to influence the 
management control decisions reached by the 
middle managers who served as subjects. The 
situation is, however, quite different when an
organization’s internal auditors, rather than the
organization’s middle managers, are involved. The
work of the firm’s internal auditors is often relied
upon by third parties, such as the firm’s board of
directors (IIA, 1981, p. 1) and external auditors
(Ward & Robertson, 1980). Accordingly, the
professional standards of the Institute of Internal
Auditors require its members to reach unbiased
professional judgments (i.e. judgments that are
not influenced by their firm’s management).
Professional objectivity is essential if internal
auditors’ work is to be relied upon by third parties
(AICPA, 1986, AU 322.07; IIA, 1981, pp. 100-102).²

¹ Most prior research on the professional judgment exercised by accounting professionals (e.g. Ashton, 1974; Ashton &
Brown, 1980; Ashton & Kramer, 1980; Libby, 1981; Ashton, 1982) has focused on how external auditors integrate
information cues into their professional decisions. Although this research employed a modification of Ashton's (1974)
exercise to gather the data, the conceptual issue involved here, the professional objectivity of internal auditors, is
fundamentally different from the issues examined in prior studies of the professional judgment of external auditors.

The need for professional objectivity is espe- 
cially great when internal auditors evaluate the 
adequacy of their firm’s internal control systems. 
The American Institute of Certified Public
Accountants (AICPA) has recognized the
important role of internal auditors in evaluating the
firm’s system of internal control, noting that “...
they act as a separate, higher level of control to 
determine that the system is functioning effec- 
tively” (AICPA, 1986, AU 322.03). If, therefore, 
internal auditors’ internal control system evalua- 
tions can be significantly influenced by their 
firm’s management, then internal auditors lack 
professional objectivity, the “... independent 
mental attitude which internal auditors should 
maintain in performing audits” (IIA, 1981, pp. 
100-102). 


HYPOTHESES 


With the discussion presented above in mind, 
it is proposed that management has the ability to 
influence the internal control system evalua- 
tions reached by the firm’s internal auditors. 
Further, it is proposed that, because of the train- 
ing and professional socialization provided by 
this organization, internal auditors who are 
members of the Institute of Internal Auditors 
(IIA) will resist efforts by their firm’s
management to influence their internal control system
evaluations. Two hypotheses are employed to
examine these proposals.


H1: Internal auditors who have knowledge of
management’s desired internal control system evaluation
outcomes will reach internal control system evaluations
which are more in consonance with management’s
desired outcomes than those of their contemporaries
without such knowledge.


H2: Among those internal auditors who have knowledge
of management’s desired internal control system
evaluation outcomes, internal auditors who are IIA members
will reach internal control system evaluations which are
less in agreement with management’s preferred
outcomes than those of non-members.


Various combinations of support or lack of 
support for the two hypotheses have different 
implications about the professional objectivity 
of internal auditors. Support for both H1 and H2 
would imply that, although internal auditors as a 
group may reach biased internal control system 
evaluations, those who are IIA members will 
resist management’s efforts to bias their
professional objectivity. Support for H1 without
support for H2 would imply that internal auditors as
a group may reach biased evaluations, with no
distinction being apparent between IIA members
and IIA non-members. A lack of support for
H1 combined with support for H2 would imply
that, although both IIA members and IIA
non-members may reach unbiased evaluations, IIA
members and IIA non-members will differ in the
extent to which they can be influenced by
management’s attempts to bias their professional
objectivity. A lack of support for both H1 and H2
would imply that internal auditors as a group
may reach unbiased evaluations, with no
distinction being apparent between IIA members and
IIA non-members.


METHOD 


It would be desirable to examine internal
auditors’ professional objectivity in a natural
work environment. Unfortunately, the
measurement of internal auditors’ professional
objectivity in a natural work setting poses substantial
difficulties. For example, would a firm’s
management be willing to acknowledge an attempt to
bias the professional objectivity of the firm’s
internal auditors? If such an attempt was made and
was successful, would the firm’s internal
auditors be willing to acknowledge that their
professional objectivity had been biased? As noted
by Swieringa & Weick (1982), the experimental
approach aims at overcoming such measurement
difficulties, since an experiment attempts
by its very nature to simplify the real-world
environment in order to focus upon particular
relevant factors. The experimental approach allows
relevant factors to be manipulated according to
the researchers’ plan, so comparisons can be
made which would not be feasible in a naturalistic
setting. The practical difficulties associated
with measuring internal auditors’ professional
objectivity in a naturalistic setting led to the use
of the experimental approach for this study.

² A related body of research (e.g. Brown, 1983; Brown & Karan, 1985; Abdel-khalik et al., 1983; Clark et al., 1979; Gibbs &
Schroeder, 1980; Schneider, 1984, 1985) examines how external auditors weight various factors when determining the level
of reliance to place on the work of internal auditors. Since these studies focus upon external auditors’ judgment processes,
they are only indirectly related to the issues considered in this paper.


The decision-making exercise 

The decision-making exercise developed by 
Ashton (1974), with minor modifications which 
relate the decision-making task to internal 
auditors instead of the external auditors Ashton 
used as subjects, was employed to gather the 
data. This decision-making exercise focuses on 
the evaluation of payroll internal control subsys- 
tems, which was an appropriate task for the 
study participants, who were internal auditors in 
banks which had substantial payroll activities. A 
2³ × 4 factorial design was incorporated into the
presentation of the decision cue information in 
the exercise, so each participant reached 32 dif- 
ferent internal control system evaluations in 
completing the 32 cases in the exercise. The 
cases were presented in random order. Figure 1 
presents an example case from the exercise. 


The participants 
The 58 study participants were all internal 
auditors who were employed by three banks in a 


1. Are the tasks of both timekeeping and payment of
   employees adequately separated from the task of
   payroll preparation? ............................... No

2. Are the tasks of both payroll preparation and
   payment of employees adequately separated
   from the task of payroll bank account
   reconciliation? .................................... Yes

3. Are the names on the payroll checked
   periodically against the active employee file of
   the personnel department? .......................... Yes

4. Are formal procedures established for changing
   names on the payroll, pay rates and deductions? .... Yes

5. Is the payroll audited periodically by internal
   auditors? .......................................... No

6. [...] satisfactory during the previous audit? ...... Yes

[Six-point response scale; scale labels illegible]

1    2    3    4    5    6

Circle the response which most closely indicates your
evaluation of this payroll internal control subsystem.

Fig. 1. Sample case from the decision-making exercise. 


middle-sized southeastern city. Thirty-two par- 
ticipants were males and 26 were females. The 
average participant had worked for about seven 
years as an internal auditor, ranging from a 
minimum of less than one year to a maximum of 
28 years. In addition, the average participant had 
worked for about seven years for the current 
employer, ranging from a minimum of less than 
one year to a maximum of 35 years. Three par- 
ticipants were certified public accountants, 14 
were certified internal auditors and 35 were 
members of the Institute of Internal Auditors 
(IIA). The data were collected by one of the re- 
searchers who provided each participant with 
an exercise and subsequently collected the com- 
pleted exercises from the participants. The exer- 
cise instructions asked the subjects not to dis- 
cuss the exercise with others until after the date 
when all data were collected. Each completed 
exercise was sealed in an attached envelope to 
ensure that participants’ responses remained 
anonymous. 
Experimental design

The post-test-only control group experimen- 
tal design, which is described in detail by 
Campbell & Stanley (1963, p. 8 and pp. 25-26)
and Kerlinger (1973, pp. 330-332), was 
employed in the experiment. The internal 
auditors who participated in the study were ran- 
domly assigned to two groups, the control group 
(n = 28) and the experimental group (n = 30). 
Since the participants were employed by three 
different banks, the employees of each bank 
were randomly assigned to each of the two 
groups in order to control for any possible or- 
ganizational effects. As noted earlier, the data 
gathering instruments were personally distri- 
buted by one of the researchers to 67 individuals 
and 58 individuals voluntarily provided re- 
sponses. The initial explanations, instructions, 
etc., provided to members of the two groups 
were exactly the same and the data were col- 
lected from both groups of participants at the 
same time. In order to simulate circumstances in 
which the firm’s management made a deliberate 
attempt to bias the professional objectivity of 
the firm’s internal auditors, the participants 
assigned to the experimental group were pro- 
vided information which their contemporaries 
in the control group did not receive. The addi- 
tional information included a statement that 
their firm’s management desired that equal 
weights be placed upon each of the six factors 
used to evaluate internal control systems (see 
Fig. 1) and consonant outcome feedback indicat- 
ing the evaluations that would result from 
employing these weights (see Fig. 2). The 
weight to be placed on these factors when 
evaluating an internal control system is clearly a 
matter of professional judgment (Ashton, 1974). 
Accordingly, management’s attempt to specify
the weights to be placed on these factors
represents an effort by management to bias
the professional objectivity of the firm’s internal
auditors.
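
The bank-by-bank random assignment described above could be sketched roughly as follows. This is only an illustration in Python, not the authors' procedure: the source does not say how the randomization was carried out, and the participant identifiers, bank labels, the `assign_within_banks` helper and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict


def assign_within_banks(participants, seed=0):
    """Randomly split each bank's auditors between two groups.

    participants: list of (participant_id, bank) pairs (hypothetical data).
    Returns a dict mapping participant_id -> "control" or "experimental".
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    by_bank = defaultdict(list)
    for pid, bank in participants:
        by_bank[bank].append(pid)

    assignment = {}
    for bank, ids in by_bank.items():
        rng.shuffle(ids)
        half = len(ids) // 2
        # First half of the shuffled list goes to control, the rest to
        # experimental, so both groups contain staff from every bank.
        for pid in ids[:half]:
            assignment[pid] = "control"
        for pid in ids[half:]:
            assignment[pid] = "experimental"
    return assignment


# Hypothetical example: six auditors in each of three banks.
people = [(f"{bank}-{i}", bank) for bank in ("A", "B", "C") for i in range(6)]
groups = assign_within_banks(people)
```

Randomizing separately within each bank means that any bank-level influence on evaluations is spread across both groups rather than being confounded with the treatment.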


RESULTS 


The first hypothesis (H1) predicts that 


Following management’s preferred internal control system 
evaluation approach would result in the evaluation shown 
above for the preceding case. 


Fig. 2. Example of outcome feedback provided to participants. 


internal auditors who have knowledge of man- 
agement’s desired internal control system 
evaluation outcomes will reach evaluations 
which are more in consonance with manage- 
ment’s desired outcomes than those of their con- 
temporaries without such knowledge. Within 
the context of the experiment, H1 implies the 
participants in the experimental group will 
reach evaluations which differ less from the out- 
comes preferred by management than those 
reached by their contemporaries in the control 
group. The second hypothesis (H2) predicts
that, among internal auditors who have knowledge
of management’s desired internal control
system evaluation outcomes, IIA members will
reach evaluations which are less in agreement
with management's desired outcomes than IIA
non-members. H2 implies that the participants
who are IIA members will reach evaluations
which differ more from the outcomes preferred by
management than those of their contemporaries who
are not IIA members. H2 applies only to individuals
in the experimental group, for the subjects in the
control group were not aware that management
had specified preferred evaluation outcomes.
Since each participant reached 32 internal 
control system evaluations, the repeated mea- 
sures analysis of variance model was initially 
employed to examine the hypotheses. The 
results are summarized in Table 1. The signifi- 
cant MGMT main effect (F = 6.87, p = 0.0088) 
in Table 1 implies that, as predicted by H1, the 
participants in the experimental group reached 
different internal control system evaluations 
than their contemporaries in the control group. 
The significant CASE main effect (F = 113.72, p 
TABLE 1. Repeated measures analysis of variance results

Source                  d.f.       F         p

MGMT                      1       6.87    0.0088
CASE                     31     113.72    0.0001
MGMT × CASE              31       0.93    0.5783
IIA                       1       0.05    0.8211
MGMT × IIA                1       6.03    0.0142
CASE × IIA               31       1.14    0.2696
MGMT × CASE × IIA        31       0.98    0.5056





The overall repeated measures analysis of variance model 
was significant (F = 28.61, p = 0.0001, R-square = 0.68). 
MGMT = the effect due to providing the participants in the 
experimental group with information about the decision 
policy preferred by the firm's management. Significant 
results imply the participants in the experimental group 
reached different internal control evaluations than those of 
the control group (H1). 

CASE = the within-person effect due to manipulating the
decision cue values (see Fig. 1) so that 32 different decision
cases were provided to each participant. Significant results
imply the individual participants reached different internal
control system evaluations when presented with different
cases.

IIA = the effect due to membership in the Institute of Internal
Auditors (IIA).

MGMT × CASE = interaction effect of MGMT and CASE
variables.

MGMT × IIA = interaction effect of MGMT and IIA variables.
Significant results imply IIA members in the experimental
group reached different internal control system evaluations
than their contemporaries who were not IIA members. Since
the IIA variable alone was not significant, significant results
imply the evaluations of IIA members and IIA non-members
differed in response to the MGMT variable (H2).

CASE × IIA = interaction effect of CASE and IIA variables.

MGMT × CASE × IIA = interaction effect of MGMT, CASE
and IIA variables.


= 0.0001) in Table 1 implies that, as has been 
demonstrated in a large body of prior research 
(e.g. Ashton, 1974; Ashton & Brown, 1980; 
Ashton & Kramer, 1980; Libby, 1981; Ashton, 
1982), the manipulation of the six decision cues 
(see Fig. 1) resulted in the participants reaching 
different evaluations for the 32 different cases 
which each participant completed. The significant
MGMT × IIA interaction term (F = 6.03, p
= 0.0142) in Table 1 implies that, as predicted
by H2, IIA members in the experimental group
reached different evaluations than IIA non-members
in the experimental group. These results
clearly provide support for the differences
implied by the hypotheses. Next, individual difference
scores were employed to examine the
directional aspects of the two hypotheses.

Individual difference scores have been
employed previously by Barefield (1972),
Einhorn et al. (1977) and Polk (1977). Initially,
the absolute difference between the internal 
control system evaluation each individual actu- 
ally reached for each of the internal control sys- 
tem cases (see Fig. 1) and the evaluation each 
subject would have reached by perfectly 
employing management’s desired equal weights 
approach (see Fig. 2) was computed. Next, these 
absolute difference values were summed to 
obtain a difference score for each participant. 
The size of each participant’s difference score in- 
dicates the extent to which the individual’s in- 
ternal control system evaluations differed from 
the outcomes desired by management. A rela- 
tively large difference score indicates that the in- 
dividual’s actual evaluations differed substan- 
tially from the outcomes desired by manage- 
ment. A relatively small difference score indi- 
cates that the individual’s actual evaluations dif- 
fered only slightly from the outcomes desired by 
management. 
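
The difference score described above can be illustrated with a short Python sketch. It takes the evaluations implied by management's equal-weights approach as given, because the source does not spell out how they were derived; the function name and the four-case data are hypothetical and serve only to show the arithmetic.

```python
def difference_score(actual, preferred):
    """Sum of absolute differences between a participant's evaluations
    and the evaluations management's equal-weights approach would give."""
    if len(actual) != len(preferred):
        raise ValueError("one preferred evaluation is needed per case")
    return sum(abs(a - p) for a, p in zip(actual, preferred))


# Hypothetical four-case illustration on the 1-6 response scale
# (the study itself used 32 cases per participant).
actual = [3, 5, 2, 6]
preferred = [4, 5, 4, 5]
score = difference_score(actual, preferred)  # 1 + 0 + 2 + 1 = 4
```

A participant who matched management's preferred evaluation on every case would score 0, and larger scores indicate greater departure from management's preferred outcomes.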

As noted earlier, H1 predicts that the differ- 
ence scores of the participants in the experimen- 
tal group will be smaller than those of their con- 
temporaries in the control group. As indicated in 
Table 2, the results of the experiment support 
this aspect of H1 (t = 2.85, p = 0.003). Further, 
H2 predicts that, within the experimental group, 
the difference scores of the participants who are 


TABLE 2. Difference scores of the control group and
experimental group participants

                        Difference scores
                        X (s.d.)

Control group           18.1 (5.9)
                        (n = 28)

Experimental group      11.6 (6.7)
                        (n = 30)

Hypothesized comparison: the difference scores of the participants
in the experimental group were smaller than those
of the participants in the control group (X = 11.6 vs
X = 18.1; t = 2.8, p = 0.003).
IIA members will be larger than those of the participants
in the experimental group who are not
IIA members. As indicated in Table 3, the empirical
data support the directional aspects of H2 (t
= 2.0, p = 0.03). The results of the experiment,
therefore, imply that, although internal auditors
as a group may be influenced by management to
reach biased internal control system evaluations,
those who are IIA members will resist
management's efforts to bias their professional
objectivity.


TABLE 3. Difference scores for IIA non-members and IIA members

                      IIA non-members    IIA members
                      X (s.d.)           X (s.d.)

Control group         20.2 (6.4)         16.7 (5.6)
                      (n = 11)           (n = 17)

Experimental group    8.2 (6.7)          13.2 (6.8)
                      (n = 12)           (n = 18)

Hypothesized comparison: (H2) The difference scores of IIA
non-members (X = 8.2) in the experimental group were
smaller than those of IIA members (X = 13.2) in the experimental
group (t = 2.0, p = 0.03).

Post hoc comparisons [Scheffe’s t-test (Hayes, 1973, pp.
605-607)]:

In the control group, the difference scores of IIA non-members
(X = 20.2) and IIA members (X = 16.7) did not
differ (p > 0.05).

The difference scores of IIA non-members in the experimental
group (X = 8.2) were smaller than those of IIA non-members
in the control group (X = 20.2) (p < 0.05).

The difference scores of IIA members in the experimental
group (X = 13.2) did not differ from those of IIA members in
the control group (X = 16.7) (p > 0.05).
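
As a rough check on the hypothesized comparison in Table 3, the t statistic can be recomputed from the reported means, standard deviations and group sizes. The sketch below assumes a pooled-variance (Student's) two-sample t statistic; the source does not state the formula used, so this is an assumption, and the `pooled_t` helper is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err


# Experimental group, Table 3: IIA members vs IIA non-members.
t_h2 = pooled_t(13.2, 6.8, 18, 8.2, 6.7, 12)
print(round(t_h2, 2))  # 1.98
```

Under that assumption the result, about 1.98 on 28 degrees of freedom, is in line with the reported t = 2.0.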


The results summarized above suggest
management may have been successful in biasing
the professional objectivity of the participants
who were IIA non-members, but may have
been unsuccessful in biasing the professional
objectivity of the participants who were IIA members.
(The significant MGMT main effect in the
repeated measures analysis of variance in Table
1 may reflect management’s ability to influence
the evaluations of only IIA non-members.) With
this in mind, some additional analysis was performed
to determine whether the participants
who were IIA members successfully resisted
management's efforts to bias their internal control
system evaluations. Scheffe’s t-test (Hayes,
1973, pp. 605-607) was employed for these
post hoc comparisons,3 which are presented in
Table 3. In the control group, the difference
scores of IIA non-members and IIA members did
not differ significantly (Scheffe’s t-test, p >
0.05). As noted earlier in the examination of H2,
however, in the experimental group the difference
scores of IIA non-members were significantly
smaller than those of IIA members. In addition,
the difference scores of IIA members in the
control group did not differ from those of IIA
members in the experimental group (Scheffe’s
t-test, p > 0.05). Further, the difference scores of
IIA non-members in the experimental group
were significantly smaller than those of IIA non-members
in the control group (Scheffe’s t-test, p
< 0.05).

The results of the additional analysis may be 
summarized as follows: IIA non-members in the 
experimental group reached evaluations which 
were more in agreement with management's 
preferred outcomes than did their contem- 
poraries in the control group; under the same 
circumstances, IIA members in the experimen- 
tal group reached evaluations which were no 
more in agreement with management’s prefer- 
red outcomes than were those of their contem- 
poraries in the control group. The evaluations 
reached by the IIA non-members were, there- 
fore, influenced by knowledge of management's 
preferred outcomes while the evaluations 
reached by the IIA members were not influenced
by knowledge of management’s preferred
outcomes. 

Accordingly, the overall results of the experiment
(see Tables 1, 2 and 3) imply that management
was able to bias the professional objectivity
of the internal auditors who were not IIA
members. Management was not, however, able
to bias the professional objectivity of the
internal auditors who were IIA members. Apparently,
the IIA members successfully resisted
management's efforts to bias their internal control
system evaluations towards the outcomes
preferred by management.

3 The use of Student's t-test, which is appropriate for a priori hypotheses, would not change the reported results.


SUMMARY AND DISCUSSION


Since internal auditors are subject to incen- 
tives and sanctions controlled by management, 
concern exists about management’s potential 
ability to bias their professional objectivity. To 
examine this issue, internal auditors from three 
banks participated in a decision-making experi- 
ment which examined two proposals. First, it 
was proposed that a firm’s management has the 
ability to influence the internal control system 
evaluations reached by the firm’s internal 
auditors. A second proposal was that, among 
those internal auditors who had knowledge of 
management’s desired evaluation outcomes, internal
auditors who were IIA members would resist
management's efforts to influence their internal
control system evaluations.

Both proposals were supported by the results 
of the decision-making experiment. Accord- 
ingly, the research findings imply that although 
management has the ability to bias the profes- 
sional objectivity of internal auditors who are 
not IIA members, internal auditors who are IIA
members will resist such efforts by their firm’s 
management. This outcome suggests that the 
Institute of Internal Auditors has been successful 
in its efforts to educate its members on the 
necessity for maintaining professional objec- 
tivity. 

The findings have implications for third
parties, such as external auditors, who would
rely upon the work of internal auditors. A number
of studies have examined the factors which
external auditors use to judge how much reliance
to place on internal auditors (e.g. Brown,
1983; Brown & Karan, 1985; Abdel-khalik et al.,
1983; Clark et al., 1979; Gibbs & Schroeder,
1980; Schneider, 1984, 1985). None of these
studies identified IIA membership as a factor to
be considered when judging internal auditors’
professional objectivity. The results of this
study, however, imply that IIA membership may
be an important determinant of internal auditors’
professional objectivity. Accordingly, the
extent to which a firm’s internal auditors are active
members of the IIA should be considered by
third parties when determining the level of reliance
to place on the work of internal auditors.

Some limitations and strengths of this study 
should be mentioned. The study participants 
were selected based on their availability, so it is
not known whether they are representative of 
the larger population to which it would be desir- 
able to generalize results. Some strengths of this 
research include the use of real internal auditors 
as participants and the use of an experimental 
design that provides the high internal validity 
necessary for the examination of the relatively 
abstract construct of internal auditors’ professional
objectivity.

Several issues for further research may be 
mentioned. It is not known how frequently in 
practice a firm’s management might attempt to 
bias internal auditors’ professional objectivity 
and this issue warrants further inquiry. Another 
issue worthy of investigation is whether efforts 
by management to bias internal auditors’ profes- 
sional objectivity would result in “whistle blow- 
ing” behavior by some internal auditors. It is 
hoped this research will stimulate accounting 
researchers to a further examination of these 
issues. 
CLINICAL BUDGETING: EXPERIMENTATION IN THE SOCIAL SCIENCES:
A DRAMA IN FIVE ACTS* 


T. PINCH, M. MULKAY and M. ASHMORE 
Department of Sociology, University of York 


Abstract 


Clinical budgeting systems are increasingly being introduced into the British National Health Service. This
paper examines in some detail the testing of one particular budgeting system. It discusses the aims,
execution and evaluation of the test. The paper is written as a play partly for reasons of clarity and
entertainment but also, and more seriously, to reflect recent concerns in the sociology of scientific
knowledge whereby attention is drawn to the parallels between analysts’ and participants’ attempts to
render a definitive view of the social world.


DRAMATIS PERSONAE 


Researchers One, Two and Three: Three 
sociologists of science who are researching into 
the practical application of health economics. 

A Tape-recorder: A small portable recorder 
(perhaps a Sony TCM 9) which plays tapes of a 
health economist being interviewed by 
sociologists. 

A Video-recorder: A VHS recorder with monitor 
which plays tapes of health economists teaching 
clinicians health economics at a special week- 
end course. 

Kathleen: A health economist working within 
the NHS. 

Don: A health economist working in a university 
applied research unit. 

Iden Wickings: The Director of the King’s Fund 
CASPE research unit. 


Throughout the play the words of the fictional
Researchers (One, Two and Three) have been
made up except in Act IV where they are drawn
from an interview transcript. The speeches of all
other characters (except the Tape-recorder in
Act IV) are taken verbatim from transcripts and
texts collected by the authors in the course of
their research in the sociology of health economics.


ACT I: AN IDEA IS BORN IN A LONDON CAFE 


It is about one year into the research project 
on the extension of economic reasoning into the 
area of health care. Two of the researchers are 
seated in a cafe in London discussing, over a cup 
of tea, how the project is going. They have just 
carried out an interview with a health economist 
who works at the nearby King’s Fund Hospital 
Trust. There is a tape-recorder on the table. As 
the researchers talk they play back parts of the 
interview they have just recorded. 


*The play is based on a wider research project that is concerned with the difficulties and dilemmas that health economists
face in the practical application of their knowledge, a full report of which will appear in Ashmore et al. (1989). We would
like to thank Iden Wickings for his time and for commenting on an earlier draft. The research was funded by the ESRC (grant
433250004) under its “Science Studies and Science Policy” initiative, phase one.

Researcher One: Well that seemed to go okay, it
was friendly and relaxed and we got lots of anecdotes
and good quotes we can use.

Researcher Two: Yes, he certainly was talkative. 
I think it definitely helped that you knew him. 
How did you get to meet him? 

Researcher One: On the cricket field. 
Researcher Two: You're kidding. 

Researcher One: Not at all. When I was at High 
Tech University we had this departmental 
cricket team and every summer we would go on 
a cricket tour to play Knowex University’s Econ- 
omics Department. When I first met him he was 
playing for them. When I say “playing”, that is a 
bit of a joke actually; he was the worst cricketer 
I ever met, even worse than me. I always
remember my first sight of him. He was a legen- 
dary character and he wore this ridiculous 
floppy sun hat. I was sitting on the boundary 
waiting to bat and someone skied a catch to- 
wards him and sure enough he dropped it. Well 
it turned out he was a great drinker and racon- 
teur. We've kept in touch ever since, but we’ve 
never talked about his career in health econ- 
omics until now. What I found to be intriguing 
about the interview is that while we were having 
all that fun on the cricket field, the poor bloke 
was going through an existential crisis concern- 
ing his faith in economics. I had no idea. 
Researcher Two: Yes, that was interesting. Let’s 
listen to that bit of the interview again, shall we? 
We've got plenty of time before we need to leave 
for our train. 

Researcher One: Okay, I’ll just rewind the tape. It
was somewhere near the beginning as I recall. 
Let’s try it here. 

Tape-recorder: ... we were interviewed on 
October the First and paid from October the 
First. Essentially the University of York had got 
some money from the DHSS for two projects, but 
it had come through a bit late, later than ex- 
pected, possibly they hadn’t got their act to- 
gether, I don’t know. And they certainly weren’t 
getting much in the way of applications, so the 
fact that I was around was appealing. And I 
couldn’t resist risk aversion, three years’ money 
— good money — flat on the campus, so I caved 
in and got married. Did teaching hospital costs 
for three years or so ... 
Researcher One (stopping tape): I think it was
after this stuff about how he worked at York, but 
let’s listen on here for a while. 
Researcher Two: I can’t believe how easy it was 
to get a job in those days — the only applicant. 
The last job I tried for in a sociology department 
had two hundred applicants. 
Tape-recorder: ... a typical York project. That’s 
to say thought up on the train getting down to 
DHSS. No I’m being unkind of course. A rela- 
tively sketchy protocol though. Let’s put it that 
way. Not the sort of detail I found later in my 
career at St Doubtings ... We produced a report, 
we analysed some numbers, but I felt looking 
back on it now, the thing was on tick-over most 
of the time — we played a lot of sport in York. 
There’s a tape running, what am I doing?

Cricket? You didn’t play much cricket in 
those days? 

Cricket not much, a lot of table tennis, 
dominoes...
Researcher Two (stopping and advancing 
tape): Do you ever stop talking about cricket? 
Let’s get on to where he leaves York for Knowex. 
Tape-recorder: ... the idea was that some of the 
work I was doing on teaching hospitals would 
become a D.Phil. And really what got my career 
going was an aspect of a formula that preceded 
the RAWP [Resource Allocation Working Party] 
formula for dishing out money... I got in- 
terested in that for no obvious reason. It was a bit 
peripheral to our work on teaching hospitals and 
I cranked out a paper on that which then eventu- 
ally got published in Applied Economics . . . That 
was enough to get a job at the University of 
Knowex. Straightforward lecturer in economics 
in 1974. And I had four years’ bashing away 
there, undergraduate tutorials but keeping up 
the health thing...
Researcher One (advancing tape): A proper lecturing
job with only one publication and no
Ph.D. Incredible. In sociology, even in the 
1970s, you needed a Ph.D. and two books for 
what few jobs there were. 
Tape-recorder: I moved to St Doubtings in ’78,
really becoming increasingly fed up with econ- 
omic theory. Not so much with health econ- 
omics... I still went along with a lot of Alan 
Williams’ line in those days. I still thought that in
the practical fields one could do something with 
economics fairly well. But I was disillusioned 
with mathematics and economic theory, which 
is, it’s essentially a game, whatever comes out is 
what you put in... I remember having a sort of 
crisis of confidence, worrying that I wasn’t doing 
the best for my students . . . I got so frustrated by 
it. It seemed so pointless to me... I actually 
started reading a copy of the latest edition of the 
American Economic Review and working 
through the articles... You know that was a 
complete waste of time: what have we disco- 
vered? 

Why didn’t you get your disillusionment earlier?


I was extremely seduced by economics... I 
first went to York, there I was, it was well taught, 
Williams and, I don’t know whether you get to sit 
in on their lectures and so on, but Williams and 
Culyer are very clear teachers. I think they’re 
very persuasive and I really, I swallowed the 
whole story. You know, if only the world was 
like Alan Williams, if only everyone analyzed 
their decisions in this way, wouldn’t it be a bet- 
ter place? So I think there is a nice internal logic 
about economics which helps, like a crossword 
puzzle — the different pieces fit together — 
whereas something like accountancy has a set of 
rules, largely arbitrary ones... 

Researcher Two (pausing tape): Poor old 
accountants, they get it in the neck every time 
from the economists. But this is good stuff. We 
can use this in our book, especially the bit about 
if only the world was like Alan Williams wants it 
to be. 

Tape-recorder: I began to ask myself ... I had tenure
at Knowex ... do I really want to spend the
rest of my life pounding away at this stuff, when
it’s clearly not getting anybody anywhere. And I 
think that’s when I really started to get fed up... 
I mean something I was using in lectures came 
out in a very prestigious journal as a comment. 
There is just some obvious side point that I'd 
been using in lectures, it never occurred to me 
to publish it because it was so obvious... And 
that contributed to my disillusionment . . . There 
was a famous article ... it was about bargaining
and where economics missed out on things, like
concentrating on a simple fair trade model...
[this] paper said that if you were going to trade
once with somebody... you can trade good
commodities or bad commodities... then on
the whole you would give the guy the duff commodity,
because on the whole you assume that’s
what he is going to give you and your gains are
maximized if you give him the duff one ... A professor
of economics at Sussex wrote a paper
pointing out that if there was an expectation of 
continued trade between the parties, then on 
the whole they would be nicer to each other. 
That’s to say, you don’t want to turn a customer 
off with a dud the first time, so you give him a 
good one. Now my wife, who had just given up 
work, said, “You know, since I've been going to
that greengrocer around the corner regularly, I 
get much better stuff than when I used to pop in 
occasionally on my way back from school”. I 
thought now, if that is such an obvious bit of 
common sense that my wife who is no 
economist but a perfectly sensible citizen, if 
that’s something that can be spotted in that sort 
of way, then why are we publishing that sort of 
thing in the Quarterly Journal of Economics? 
Researcher One (stopping tape): Presumably to 
get a job as an economist! Anyway you've got to 
admire him for having the courage of his convic- 
tions, giving up a university position to work as 
a health economist in an applied context. But, 
surprise, surprise, when he got there he found 
that there wasn’t much that he could do either, 
because it turned out that issues were settled 
more by local political interests than by econ- 
omics. There was a good bit on this later on. (Ad- 
vancing tape.) Listen to this. He's talking about 
a health economics study of mobile X-ray sc- 
reening which showed that the costs far out- 
weighed the benefits because of the low num- 
bers of cancers and TB cases spotted. 
Tape-recorder: I mean looking for six of anything
per thousand [screened] just doesn’t seem like a 
good piece of public money. 

So it’s that kind of decision that you feel
health economics could help with?

I used to, I think certainly yes when I 
moved ... I hoped that the sort of research that 
I was doing would help.

But it didn’t? 

Well it doesn’t... it’s only more recently that 
I've come to the conclusion that it doesn’t help; 
partly because of its failure to take on board the 
different interest groups... and I’m annoyed 
with myself for not seeing it earlier. I mean how 
could I? I’m annoyed with my own naivety if you
like. How could I have ever believed that you 
would walk right into a meeting of different in- 
terest groups and say, here are the costs of doing 
this, and here are the benefits, and here are the 
courses of action, and we’ve decided that bene- 
fits exceed costs by most in this one, so will you 
all please agree. I mean it seems to be mad look- 
ing back, that it’s incredibly stupid of me to ever 
fall for that... 

Are you saying that a perfect world, or a better
world would involve being able to go round
a table, and showing them the figures, and 
coming to an agreement on the basis of... ? 

Not any more. I used to think that. Now, I 
think that there’s no such thing as a better world. 
I mean there are just different worlds in which 
different people manage to secure more of a service
that they’re interested in. And I fully understand
now why those people defended the mass
miniature X-ray service. If they were reassured 
by it, then okay, fair enough, that’s a legitimate 
reason for trying to hang on to the service... 
Researcher Two: He seems to have become in- 
creasingly disillusioned with health economics. 
What’s he doing now working at the King’s 
Fund?

Researcher One: They offer courses to clinicians 
and health service managers. That’s why Kath- 
leen was there. 

Researcher Two: I got a hell of a shock bumping
into her in the lift. I just didn’t expect to meet 
another of our respondents there. 

Researcher One: You did very well to remember 
her name. I recognised her from an HESG 
[Health Economists’ Study Group] meeting but 
didn’t have a clue who she was. Rewind the tape 
a bit, there was a very funny story at the start 
about consultancy. Before you arrived he got a 
phone call from the World Bank in Washington. 
They wanted to fly him over for a two-day consultancy.
(Researcher Two rewinds the tape.)
Tape-recorder: ... it is all piggy backed onto Griffiths
— management, consultancy, advice...
someone pays for a group of managers to come 
in and talk about their problem . . . a sort of mix- 
ture of consultancy, therapy, customized teach- 
ing. 

How much bealth economics enters into it? 

Not much. I do some sessions particularly on 
economic type things, but they’re more really 
health service finance. I do stuff on clinical 
budgeting, it is something that all the managers 
want to know about... Well I don’t get phone 
calls from the World Bank all the time — just an 
accident. One of the reasons I’m — I wouldn't 
say I was successful — one of the reasons I'm 
popular is because I almost never say no... I've 
done a little bit of background work for London 
Weekend TV a couple of times for cas — cash in 
hand. They wanted some big numbers. What will 
AIDS cost in 1993 if we don’t act now? Like a lot 
of economics you collect some numbers on cost, 
unit cost and you collect some numbers on num- 
bers of units and you sit there and multiply them 
and out comes your fee... With London Week- 
end TV it was particularly nice because one of 
the guys providing the numbers kept changing 
his mind. So they were on the phone to him for 
ages chatting and I was sitting there, at the other 
end of the phone with this London Weekend guy
sweating. And I was just sort of, he thought I was 
working out the numbers, I was working out my 
Researcher One (laughing): Wonderful. But lis- 
tening to the tape makes me feel depressed. 
Researcher Two: Because you never get any con- 
sultancy? 
Researcher One: That’s true, but no, I was being 
serious for a moment. It’s such a mess this health 
economics study. These people are just out 
there in the messy real world and it’s a problem 
making sense of it all, trying to get a handle on 
things. I mean how do we treat this last inter- 
view? Was it with someone who works “inside” 
or “outside” the health service? Is he even a 
health economist at all as he seems to have given 
up serious research in health economics? 
Researcher Two: Well most of the health
economists we have talked with have claimed
not to be proper health economists. That seems 
to be part of the character of health economics 
— it’s either what you used to do or what someone
else is doing.
Researcher One: But then how are we going to 
conclude anything? In a way, his current view, 
which is that political interests are more import- 
ant than economics in understanding the NHS, is 
closer to our own view, but if he has ceased to be 
an economist at all, what weight can we give to 
his views? It wasn’t like this when I studied 
physicists. You never got physicists saying, “Well 
actually I’m not really a physicist” or “I gave up
physics because it was no better than common 
sense”, or “I now see that physics is all about 
social and political factors”. It was all so clear 
cut. You found your experimentalist who claims 
to have observed a new phenomenon of the 
natural world and then you found a second who 
disagrees with the first and sees something com- 
pletely different. And then you would interview 
both of them, along with their supporters and 
other protagonists. Finally you showed there 
were good arguments on both sides, hey presto!, 
in comes the social world to settle matters. It 
was such a neat and tidy thing to do — all in the 
context of a tight technical argument over a 
small set of experiments as Pinch (1986) did in 
his recent book. I really should have stayed with 
physics. Now we have these health economists 
who don’t do experiments, and don’t even claim 
to be health economists, and we waffle away for 
hours with them about the myriad problems of 
the National Health Service [NHS] with a bit on 
their career, a bit on QALYs [Quality Adjusted 
Life Years], a bit on option appraisal, a bit on 
measurement, a bit on rationality and so on. You 
never know whether they really know what 
they’re talking about or whether they’re just 
talking off the tops of their heads. How on earth 
are we going to make sense of it all? 

Researcher Two: I really do think you are romanticizing
a little about the sociology of the natural
sciences. Take another read of some of the
recent stuff by Gilbert & Mulkay (1984),
Ashmore (1988, 1989), Pinch & Pinch (1988)
or Mulkay (1984, 1985) on his own. But that
aside, surely the health economists’ world is no
more and no less messy than ours as sociologists 
of science? Take the origins of this project. As I 
recall, we decided on health economics as a 
topic after you phoned a health economist friend 
at High Tech who told you about the HESG and 
since you only had two days in which to write 
the proposal, you thought “let’s study this 
because at least we have a convenient sampling 
frame”. Typical back-of-the-envelope type thing 
I seem to recall. 

Researcher One: Well let's forget about that and 
be thankful the tape recorder isn’t on! 
Researcher Two: You also keep saying that so and 
so isn’t a proper sociologist of science and that 
you yourself have changed your views on the 
sociology of science as you have moved 
between different research locations. And look 
at all the changes in emphasis we have had dur- 
ing the course of this project. We don’t do ex- 
periments and we seem to manage alright; so 
why is it any worse for the health economists? 
Researcher One: I see reflexivity is rearing its 
ugly head again. Rather than go into all those 
issues which are dealt with in Woolgar’s (1988) 
new collection, why don’t you find me a nice 
clean-cut area of health economics, rather like 
scientific experiments, which I could feel com- 
fortable about studying. 

Researcher Two: How about clinical budgeting? 
Researcher One: Oh you mean the thing that all 
these managers want to learn about and which is 
taught at the King’s Fund? It’s some sort of 
financial decision-making system to enable clini- 
cians and managers to manage more efficiently 
isn’t it? 

Researcher Two: That’s right and many health 
economists we have talked to seemed to be 
enthusiastic about it. But, as usual, the people 
involved sometimes deny being health 
economists. 

Researcher One: But how is clinical budgeting 
like science? Surely budgeting is what you and I
do when we come down to London and we have 
to decide whether we can afford British Rail 
sandwiches or a meal out on the research project.

Researcher Two: I think the health economists
would say there was more to it than that. But on
the train down I read this interesting article 
about experiments on clinical budgeting. It was 
in a health economics journal with an editorial 
by Tony Culyer. 


Researcher One: Tony Culyer, he’s one of the 
leading lights in health economics at York, right? 


Researcher Two: Yes, that’s the person. I have the 
journal in my briefcase. (Researcher Two fumbles
under the table for his briefcase and pulls
out a battered copy of the journal Nuffield/York
Portfolios. He opens it and starts to read out
loud.) “Despite the three recent major organiza- 
tional changes in the National Health Service the 
most striking features that continue to charac- 
terise its management are the absence of variety 
in experimentation in alternative ways of get- 
ting things done. .. This folio reports on what is 
the one outstanding exception to these deficien- 
cies: some real experiments in offering clinicians 
budgetary incentives to be better managers. 
Their importance is scarcely to be
underestimated, given the uniqueness of such 
ordinary experiments in Britain. Iden Wickings 
and James Coles make the ethical case for clini- 
cal budgeting in the NHS and show how it links 
up with new developments in the provision of 
information for management at all levels” 
(Culyer, 1985, p. 1). Well this seems to be all 
about experimentation so why not have a look at 
it? 

Researcher One: Pass it over. (Researcher Two 
hands over the journal and Researcher One
starts to read.) It certainly does sound as if this is
what I have been looking for. Wickings & Coles 
say that “There have now been many clinical 
budgeting experiments in Britain” (Wickings & 
Coles, 1985, p. 4). And this guy Wickings seems 
to have been involved with most of them. It says 
here that there were some very recent experi- 
ments carried out by Wickings which were 
highly influential in persuading the Griffiths 
Inquiry into health service management to advo- 
cate the introduction of management and clini- 
cal budgeting. They report that “A more basic 
method of reaching... agreement [between 
clinicians and management] has recently been
tested in some clinical budgeting experiments.
The method involved district managers and
clinicians negotiating Planning Agreements
with Clinical Teams (PACTs)”. (Researcher
One giggles as he recognizes the awful pun.) “In
early 1985, an independent Evaluation Group 
chaired by Professor Buller, the previous Chief 
Scientist at the DHSS concluded, ‘The evaluation 
group is not aware of any other system than 
PACTs that offers similar interaction between 
managers and clinicians’” (Wickings & Coles,
1985, p. 7). Blah, blah, blah. It seems that these 
experiments were a success. It goes on to say, 
“The Evaluation Group is unanimously of the 
view that in principle this PACTs-centred 
budgeting system has all the right ingredients for 
improved resource management in the NHS and 
it should be given the support needed to ensure 
its wider dissemination within the service” 
(Wickings & Coles, 1985, p. 7). This is great. I
will have to get hold of that report on these suc- 
cessful experiments. Perhaps there are some cri- 
tics somewhere we can track down. We might 
even have an experimental dispute as in physics. 


Researcher Two: Are you happy now? 


Researcher One: Well happiness is asking too 
much. But at last we are going to be able to de- 
construct some real science instead of all these 
pseudo-scientific measures such as QALYs and 
the like which no one takes seriously. (Looking 
at watch.) Good grief! Look at the time. Come 
on. We'll have to shift if we want to catch the 
5.30 train back to York. 


Researcher Two: Aren’t you forgetting one thing? 
Researcher One: What’s that? 


Researcher Two: We said we would work out 
how much of the research budget we had left to 
spend before deciding whether we could afford 
a meal on the train. 


Researcher One: That’s right, but we've got no 
time to do it now. We’re bound to have some 
money left in the kitty and there’s no point in 
slumming it. I feel it’s been such a productive 
day. I even think we might stretch to a bottle of 
wine on the train back. (They hastily pay for
their cup of tea and leave.)

ACT I: WHAT IS CLINICAL BUDGETING?


Researcher Two is seated at his desk at the 
University of York staring at a video monitor. It is 
paused on a frame showing a health economist 
gesticulating at a large table of figures drawn on 
a blackboard. Researcher One enters carrying an 
envelope file of papers under his arm. 


Researcher One: Morning. Mind if I join you for a
bit? 

Researcher Two: Not at all. I’ve been working on
a chapter of our book. It’s on the topic of option 
appraisal and I’ve been trying to read some of the 
figures from an option appraisal Don presented 
at that clinicians’ course which I videoed. It’s 
starting to give me a headache and anything for 
a break. 

Researcher One: Health economics again I’m 
afraid. 

Researcher Two: Clinical budgeting? 
Researcher One: That’s right. I got hold of that 
CASPE report, you know the one in which Wick- 
ings and his team present the results of their 
experiments. That document is dynamite. I can’t 
really believe what I’m reading, so I just want to 
go through a few of the points with you to make 
sure I’ve got it right. I’ve started to write it all up 
for a talk I’m meant to be giving at Brunel Univer- 
sity. (There is a knock at the door. Researcher 
Three enters.) 

Researcher Three: Mind if I join you? I need a 
break from administration. 

Researcher One: Sure. We were just about to go 
through some of my material on clinical budget- 
ing. I’m glad you’ve popped by because I’m draft- 
ing something to present to a seminar at Brunel 
University and this is an ideal opportunity to try 
it out on you both. There may be some health 
economists in the audience and I want to make 
sure I’ve understood the basic idea of clinical 
budgeting. Can I read you the start of the paper? 
Researcher Three (sitting down): I’ll imagine I'm
an economist. Go ahead. 

Researcher One (taking a type-written manuscript
from the file he reads out loud): “Clinical
budgeting and its close relation, management 
budgeting are financial decision-making systems
which are intended to give users of health care
resources, and in particular clinicians, a greater 
degree of choice over how resources are allo- 
cated such that, overall, resources may be used 
in a more efficient way” (Pinch et al., 1987, p. 
15). 

Researcher Two: Sorry to interrupt you, but what
justification is there for treating management 
and clinical budgeting as the same thing? 
Researcher One: I’m glad you asked me that. As 
far as I can see they are essentially the same thing 
from the point of view of economics except that 
the emphasis in the two systems is slightly differ- 
ent. Overall they are both ways of planning a 
budget so as to make clinicians and managers 
more aware of costs and thus more efficient. In 
clinical budgeting the prime target is clinicians. 
There are powers of virement... 

Researcher Two: Virement? What on earth’s that? 
Researcher One: It’s the ability to transfer a 
surplus from one category to balance a deficit 
under another head. If a saving is made, the 
money can be spent on something else the clini- 
cians think is desirable. Clinical budgeting has 
virement as a direct incentive to clinicians. Man- 
agement budgeting, on the other hand, doesn’t 
offer clinicians the same powers of virement. 
There are also differences in the ways the cost- 
ing is done. In management budgeting costs 
include overhead costs such as rates or the cost 
of running the boiler house. The costing infor- 
mation is generally less accurate. 

Researcher Two: They sound rather different to 
me and I’m not certain you are justified in lump- 
ing them together. Perhaps you should look at 
the ways in which for some purposes they are 
treated as the same and for other purposes as dif- 
ferent. 

Researcher One: Look, as usual you are trying to 
be too sophisticated. I want you to react as an 
economist might react. From the economist’s 
point of view they are the same thing. I envis- 
aged this might be a problem, so just to back me 
up, I found this article by Wickings in the Health
and Social Service Journal where he addresses
precisely this issue. (Researcher One takes
another paper from his file.) Let me quote you
the man himself: “Are management budgets different
from clinical budgets? Clinical budgets
have been under test in several countries for a 
number of years. In the NHS the CASPE Research 
Unit now has three experiments in progress... 
The clinical budgets now in use have many fea- 
tures in common with management budgets 
although there are a few differences” (Wickings, 
1983, p. 466). He then goes on to say what 
exactly these differences are: “To summarise, 
the differences are small although they possibly 
could be significant. None the less the
approaches are sufficiently similar for most of us 
to ignore the finer distinctions” (Wickings, 
1983, p. 467). I take that as warrant to ignore the 
differences for the purposes of introducing clin- 
ical budgeting in my paper. 

Researcher Two: Yes, but as we know, Wittgen- 
stein said that all similarity and difference judge- 
ments are accomplished by us and similarities 
and differences don’t reside out there. 
Researcher One (exasperated): Of course, 
everything may seem either different or similar, 
but that is the kind of nuance which I just don’t 
want to take up here. 

Researcher Three: If I may interrupt. It seems to
me that both of you are right. 

Researchers One and Two: That’s really helpful.
Researcher Three: Let me explain. It is clearly the 
case that any two things can be seen to be either 
similar or different. But whether you do “similar- 
ity work” or “difference work” depends on the 
practical occasion at hand. Presumably, for the 
practical purpose of presenting a paper to 
economists, you should treat management 
budgeting and clinical budgeting as the same 
thing. Similarly, for the practical task of his 
article, Wickings was warranted in treating the 
differences as being negligible. But it is also quite 
correct to point out as he did that for other pur- 
poses the differences could become quite cru- 
cial. 

Researcher One: Well, given the preference to 
seek agreement in conversations, this is prob- 
ably a good point for me to return to reading my 
paper. I only want you to judge whether health 
economists would find my account plausible, 
that’s all. 

Researcher Two: In other words, you want us to 


T. PINCH, M. MULKAY and M. ASHMORE 


stop raising all the interesting issues. But as that 
is what economists also seem to want you’re 
probably on the right track. 

Researcher One (reading): “In the context of 
hospitals, where clinical budgeting is initially 
being introduced, rather than Health Authorities 
making a yearly allocation of resources to func- 
tional budgets — so much to pharmacy, so much 
to radiology, and so on — budgets will be allo- 
cated to each major area of clinical activity. This 
means that individual clinicians (and ward 
sisters) will have a greater part in deciding the 
resource allocation for the budgetary year. Clini- 
cians need to be provided with information on 
how much different components of clinical 
activity cost (e.g. the costs of an X-ray, of a test 
done in a pathology laboratory, and so on) and
on how much of their budget they have spent.
This information is provided by new computer 
systems” (Pinch et al., 1987, p. 15). 

That all seems perfectly straightforward I 
hope. “Clinical budgeting is held to be a way of 
achieving a more “rational” and “efficient” dis- 
tribution of scarce resources such that ulti- 
mately patient care will be improved. Underly- 
ing the new decision-making systems is the view 
of social behaviour which is prevalent in econ- 
omics, and it is no accident that the leading 
proponents of clinical budgeting within the U.K. 
have been health economists. According to 
economists, given scarce resources, individuals 
make choices in which they trade-off the costs of 
some action against benefits such as to maximize 
the benefits for themselves or for some group 
they purport to represent. The problem for
health economists is that in this case the alloca- 
tion of scarce resources cannot be mediated by 
the usual mechanism of market prices. This is 
because it is held that health care should not 
depend upon the ability to pay. Health 
economists are thus forced to search for surro- 
gates for market prices. One way round the diffi- 
culty is to treat the consumption of resources by 
groups other than patients — such as clinicians 
— as the mediators of market forces. This is, in 
effect, the basis of the economic rationale for 
clinical budgeting” (Pinch et al., 1987, pp. 16-18).
I then give a little quote from a health
economist supporting this underlying economic 
rationale. I go on to say: “The argument is that 
the system as a whole will operate most effi- 
ciently when benefit is being maximized by 
clinicians — clinicians in this case act, as it were, 
on behalf of their patients. As Wickings & Coles 
write, “... the clinicians can be given extra discretion
and thus have an incentive to use their allocated
resources more efficiently in the interests
of their own clinical service and their patients.
In this way optimizing the output of the
NHS, in terms of quality and quantity of the service
provided” (Wickings & Coles, 1985, p. 3).
In short, the rationale underlying clinical 
budgeting is none other than the standard route 
which economists offer for reaching Nirvana 
(that perfectly rational society in which the 
greatest good to the greatest number is pro- 
duced by individuals trading off costs and bene- 
fits and maximizing benefit)” (Pinch et al., 1987, 
pp. 18-19). That’s it so far. Does that seem okay? 
Researcher Two: I’m sure you're right that health 
economists do advocate clinical budgeting — 
our interviews support that. But I still have wor- 
ries about you presenting a definitive version of 
clinical budgeting as if it was all based upon 
economic principle. I was looking the other day 
at the video of the talk on management budgeting
which Kathleen gave at the course for clinicians.
As I recall, she says hardly anything at all
about economic principle. Instead, she put it all 
in terms of the practical problems which man- 
agement budgeting helps to solve. 

Researcher Three: Is that on the same tape you 
have just been looking at? 

Researcher Two: Yes it’s after this option appraisal
stuff. Shall we have a look? (Advancing tape.)
Researcher One: I really want to get onto the 
testing of clinical budgeting as soon as possible. 
That’s the bit that interests me because it’s most 
like physics. 

Researcher Two: This is the sort of thing I mean. 
Here is Don, who chaired the session, introducing
Kathleen.¹

Video-recorder: ... it’s also relevant, of course, 
because it’s actually being imposed to a consid- 
erable extent on the service. And so exactly 
what is the sort of experience that have, there 
have been so far? They are both highly relevant 
questions and they’ll be questions to which
Kathleen will be addressing herself this morning.

Researcher Two (stopping and advancing 
tape): See what I mean? He seems to be em- 
phasizing the practical side of knowing about 
something that is going to come into force any- 
way. Kathleen’s introduction is also in terms of 
her practical experience with management 
budgeting. Listen to this: 

Video-recorder: Every district has to commence 
implementation... and we went out to district 
and worked with districts and helped them 
implement management budgeting . . . So if your 
district is starting on the path of implementing 
management budgeting, these are the sorts of 
area you might find yourself being involved in. 
Researcher Two (stopping and advancing 
tape): Then she uses all the rhetorical devices 
we are familiar with from our “Colonizing the 
Mind” paper (Mulkay et al., 1987). She dis- 
associates management budgeting from cost cut- 
ting and, of course, it’s nothing to do with 
accountancy.
Video-recorder: This is very very important. 
Management budgeting is not a costing system, 
it’s not a glorified cost accountancy system, it’s 
about management, managing resources... 
Researcher Three: That’s a lovely example of a 
three-part list with a contrast. In other words, 
two things which management budgeting isn’t 
are presented, followed by a third thing which it 
is. And look how each point is accompanied by a 
downward arm movement to add emphasis. She 
is using all the skills of political rhetoric which 
Max Atkinson (Atkinson, 1984) documents. 
Researcher Two (advancing tape): Later on she 
claims management budgeting is all about help- 
ing patients: 

Video-recorder: Also it’s quite specifically
patient related, in other words you're looking at
the cost of patients, you’re looking at the sort of
things you can deliver to patients, it makes sense
to the consultants, to the doctors, it also makes
sense to the nurses, because under management
budgeting we actually have a system of ward
budgets and consultant budgets.

¹ All the excerpts quoted here are taken from a talk given at a course for senior clinicians entitled “Effectiveness and Efficiency
in Patient Care” held at Bowness-on-Windermere, Cumbria, England, 17-18 March 1986.

Researcher One: It seems as though it is designed 
to help just about everyone. 

Researcher Two (advancing tape): And the 
implications are only to be felt at the margins. 
Video-recorder: And a lot of — in my experience 
— a lot of consultants actually, are happy with 
what they're doing now and they just want, you 
know, want to make sure they’re not going to get 
squeezed. But you know they just chug along 
and maybe in a few years time they’ll make some 
more changes . . . I mean I’m sure that Don is say- 
ing what you know, that most of the changes are 
at the margins. A great body of your costs are 
fixed, it is quite difficult to change... 
Researcher Three: It seems to be rather like the 
distinction between the “strong” and “weak” 
programmes of health economics which we outlined
in our paper “Colonizing the Mind” (Mulkay
et al., 1987). Kathleen and Don when they
talk to clinicians present management budgeting
as something which will help them do what they 
do already a little better. It involves no radical 
change and affects things only at the margin. It is 
very much in the vein of the weak programme. 
Researcher One: It’s what we might call a “user 
friendly” system. 

Researcher Two: That’s a good metaphor. After 
all, one of the main features of clinical budgeting 
is its use of computers which doctors fear will 
feed yet more useless information into the NHS. 
Kathleen addresses this point specifically (re- 
winding tape). 

Video-recorder: ... it sounds horrible. I’ve just 
been on a planning course at the King’s Fund last 
week and we've sort of had all this wonderful 
management jargon thrown at us, you know and 
you can open your mouth and out it comes... 
Researcher Two (stopping tape): Sorry that’s the 
wrong place. 

Researcher One: That must have been the course 
at the King’s Fund which she was attending
when we met her a few weeks ago. I’m really
starting to feel part of this network. 

Researcher Two (rewinding tape some more): I 
think this is the right place now. 
Video-recorder: ... the point is, you must have a 
budget statement that is accessible to you, that is 
interesting to you, that gives you the sort of 
information you want. 

Researcher Three: Does she ever stop using 
three-part lists?
Researcher Two (advancing tape): Her talk is 
full of them. Here is another where she argues 
that what consultants want is more flexibility —
a flexibility which of course management 
budgeting provides them with. 

Video-recorder: ... and as consultants we want 
the ability to get our hands more on the budget, 
we want more flexibility, we want to be able to 
actually change more within what we’re
doing... 

Researcher Two: This version of management 
budgeting seems to be a long way away from the 
attempt at radical change in clinicians’ 
behaviour which you propose at the start of your 
paper. (Advancing tape.) Here is just one last 
statement from her as to what it is. This is prob- 
ably the weakest version of all. 

Video-recorder: Management budgeting isn’t a 
panacea; management budgeting isn’t going to 
solve your problems. What it is, it’s a searchlight 
on the management problems...

Researcher Three: I don’t believe it, another 
three-part list and contrast formulation. 
Researcher One: Yes, I’m sure he’s selecting the 
data so that we only listen to lists and contrasts. 
Researcher Two: Not at all, but I must admit 
when I went through the tape the other day I 
marked the places on the counter which I 
thought might interest us. 

Researcher Three: If I may summarize. There is a
radical economic rationale or “strong pro- 
gramme” of clinical budgeting which we have 
used at the start of the Brunel paper and then 
there is a “weak programme” of trying to help 
clinicians with their problems and attending to 
their misconceptions about these kinds of 
budgeting systems. And it is this latter version 
which is presented by Kathleen in her talk. 


Researcher Two: Maybe she takes a rather different
line in her presentation at that recent HESG
session on clinical budgeting. 

Researcher One: I really feel we should move on 
to the CASPE study, we seem to be getting side- 
tracked. 

Researcher Two: No, this is important because if 
Kathleen presents management budgeting in a 
different way there we can start to document 
how management budgeting is a flexible 
resource which actors present in different ways 
for the particular occasion at hand. 

Researcher One: Well, her talk at the HESG is not 
a very easy thing to analyze because as you may 
remember she spends most of her time counter- 
ing another health economist, Peter West, who 
raised eighteen different objections to clinical 
budgeting. 

Researcher Two: Do you by any chance have a 
copy of the relevant documents with you? 
Researcher One (searching in the folder): This is 
Peter’s paper called “Clinical Budgeting: A
Critique”. And Vivienne has just finished the 
transcript of Kathleen’s response and the sub- 
sequent discussion. There may be a few words 
she didn’t get but as usual it’s good enough to 
work from. 

Researcher Two: How does Peter present clini- 
cal budgeting? 

Researcher One: Well rather in the same way that 
I presented it in my paper for Brunel. In fact it 
was Peter who was the health economist I 
quoted in the introduction to my paper. He said: 
“The central plank of clinical budgeting is that if 
the use of services was charged to a clinician’s 
budget, higher cost services would be reflected 
in a faster depletion of the budget, forcing con- 
sultants and other doctors to choose between a 
reduced level of activity and a reduced use of resources
for each case. This is precisely the model
that economists use in examining consumer be- 
haviour in the market place” (West, 1986, p. 2). 
Researcher Two: But he’s criticizing clinical 
budgeting there, so how do you know that his 
version is the definitive one? 







Researcher One: Well, as I said before, he simply 
puts forward the economic view underlying the 
whole thing, so I would have thought that it was 
largely uncontentious. 

Researcher Two: But the whole point is that you 
can talk about it in the way that Kathleen did 
without ever having to mention this economic 
rationale. 

Researcher Three: Perhaps we can settle this by 
seeing how Kathleen responded to Peter’s criti- 
cisms. 

Researcher One: Okay, but as I seem to re- 
member her response was pretty weak. She 
doesn’t seem to take on board any of his econ- 
omic arguments. But we're wasting time. The 
important thing is to move on to the testing of 
clinical budgeting. We don’t want to get bogged 
down in these nuances of presentation. 
(Researcher One passes transcript over and says 
testily:) But if you are so keen, you have a look at 
it. 

Researcher Two (reading through transcript): 
Well, here’s something interesting for a kick off. 
She says: “I mean we’re starting to get into prob- 
lems already because he seems to use clinical 
budgeting and management budgeting inter-
changeably. I’ve gone back to Griffiths and I
think that we’re quite clear about what we’re
talking about”.* Then she quotes Peter as saying
that “the main objective of clinical budgeting is 
to increase efficiency”. But according to her, 
“That is not the case. I mean if you go back again 
to your Griffiths then management budgeting is 
very much about management and it’s about 
accountability. Increasing efficiency is not in my 
view the main view of clinical budgeting”. In 
other words, there is a genuine dispute over the 
meaning of clinical budgeting or management 
budgeting, call it what you will, at a very funda-
mental level. In summarizing her comments on
Peter she says: “. . . what he’s saying essentially is 
nothing really much to do with management 
budgeting ... he’s missing the point about man- 
agement budgeting in Griffiths, which is about 
management and accountability”. By the same
token you missed the point about clinical
budgeting in the introduction to your paper for
Brunel.

*Quotations from “Kathleen” and from Peter West are taken from a transcript of the HESG meeting, University of Bath, 7 July 1986.

Researcher One: I can’t take Kathleen seriously. I 
mean, who is this “Griffiths” she keeps harking 
back to? Is he the Isaac Newton of health econ- 
omics or perhaps the Albert Einstein? No, he’s 
Roy Griffiths, a manager of a big supermarket 
chain! And no one at that recent HESG meeting 
we attended took her seriously either. Look at 
the transcript; Peter puts her down to devastat- 
ing effect (reading transcript): “Now Kathleen 
says it wasn’t intended to increase efficiency, but 
it was intended to increase accountability and 
responsibility. It seems to me that, I mean what 
is it, why are you trying to increase accountabil- 
ity if not to increase efficiency? Why are you try- 
ing to make people more responsible? You’ve 
overspent £5000 — congratulations (laugh-
ter)”.
Researcher Three: Notice the use of the three-part
joke format, building up a puzzle in a list of three.
It’s no wonder he got masses of laughter. 
Researcher Two: It seems to me that you’re 
taking Peter’s side against Kathleen and you sim- 
ply can’t do that because the “correct” view is 
what is precisely at stake. They are both reputa- 
ble health economists who are very familiar with 
clinical budgeting and to take one side would be 
to prejudge the issue. 
Researcher Three: Perhaps we can go back to this 
weak programme/strong programme idea. Did 
Kathleen’s version of clinical budgeting pre- 
sented to the HESG differ from that which she 
presented to the clinicians?
Researcher One: As I've said, it’s quite hard to tell 
because at the HESG she was very much on the 
defensive in response to Peter’s attack. How-
ever, I did notice one startling change. If you
recall when talking to the clinicians, she stressed
that management budgeting was designed only
to bring about changes at the margin. But she 
said exactly the opposite at the HESG. Listen to 
this: “I think another problem with management 
budgeting is just seeing it as being movements at 
the margin ... And again I think that misses the 
point as to what we're trying to do and what Grif- 
fiths is trying to do; he’s trying to look at the 
totality of resource allocation and management
of those resources. And I think that concentrat- 
ing at the margin just isn’t really what the core of 
the issue is. I feel it’s a very narrow view”. 
Researcher Three: That’s interesting. Kathleen
certainly seems to go for a stronger version of
management budgeting when arguing with 
Peter West at the HESG. 
Researcher Two: Which is exactly my point. If 
there is no one definitive version of clinical 
budgeting, how can we present one at Brunel? 
Researcher One: Listen, we have been through 
all this before. At Brunel I'll be talking to health 
economists not clinicians, so the “strong pro- 
gramme” version is the one that is appropriate. 
Researcher Two: I just find it odd that we as 
sociologists can feel happy about changing our 
versions of what clinical budgeting is about to 
suit different audiences. 
Researcher Three: But if the economists manage 
to do it, why shouldn’t we as sociologists also do 
it? But I think we're all getting tired and could do 
with a coffee. I feel it’s been a very productive 
session. 
Researcher One: Well that’s debatable. Perhaps I
can tell you about the CASPE tests of clinical 
budgeting over coffee. (Researchers all get up 
and leave office for coffee bar.) 
ACT II: THE PROBLEMS OF 
EXPERIMENTATION 


The three researchers are seated around a 
table drinking coffee. Researcher One has a 
document spread out on the table in front of him 
to which he refers as he talks. 

Researcher One: The results are contained in this 
report entitled Experiments Using PACTs in 
Southend and Oldham HAs. HAs are, of course,
Health Authorities. It’s written by Iden Wick-
ings, Timothy Childs, James Coles and Claire
Wheatcroft and is produced by the CASPE 
research unit of the King Edward’s Hospital Fund 
— better known as the King’s Fund. 

Researcher Three: I know what the King’s Fund is 
but what does CASPE stand for? 

Researcher One: These health economists love
their acronyms. CASPE stands for Clinical
Accountability, Service Planning and Evaluation. 
It is a sub-unit of the King’s Fund. It seems to 
have been established by the Department of 
Health and Social Security [DHSS] as a separate 
unit to carry out targeted research such as that 
on clinical budgeting. 

Researcher Three: And just to fill me in, when 
was the report produced? 

Researcher One: It came out in December 1985. 
That is seven years after the research was funded 
by the DHSS in 1978. The project actually 
started in 1979. 

Researcher Three: Was it a large amount of fund-
ing? 

Researcher One: Well it was enough for Wick- 
ings to head a research team consisting of three 
staff with nine additional research team leaders 
located in the field at different NHS Districts. 
That, on anyone’s reckoning, is a sizable opera- 
tion. The funding was, for instance, about ten 
times larger than the grant we've got for our 
research on health economics. Incidentally, I see 
that Kathleen is listed here as one of the research 
team leaders. 

Researcher Two: She seems to turn up every- 
where. 

Researcher Three: Are these experiments the 
first of their kind in the U.K.?

Researcher One: Yes, but a clinical budgeting 
system has been in operation at Johns Hopkins 
University Hospital in the States for the past 15 
years and several European countries are also 
experimenting with clinical budgeting. How- 
ever, given the peculiarities of health care sys- 
tems in different countries, such experiments 
haven’t played much part in the U.K. debate. 
Researcher Three: I see. Is this Wickings’ first
shot at this type of experiment? 

Researcher One: Wickings took part in two 
earlier small-scale studies in this country. The 
first was at Westminster Hospital and was the 
prototype for his current work. Clinicians man- 
aged their own budgets and this led to some sav- 
ings being made. In his second study in Brent 
health district, rather than give the clinicians 
budgets, he only provided them with informa-
tion on costs, and this proved to be less success-
ful. Although, as we shall shortly see, what suc-
cess means in this game is far from obvious. 
Researcher Three: And just to make sure I've got 
it absolutely clear, it is this study reported on 
here which influenced the Griffiths Inquiry to 
advocate what they called management budget- 
ing? 

Researcher One: Absolutely. The Griffiths 
Inquiry team visited Wickings’ experiments 
whilst they were still in progress and they were 
so impressed by what they saw that they recom- 
mended the implementation of management 
budgeting in their report. Now twenty so-called 
“demonstration districts” have been set up to 
further the implementation programme. Wick-
ings has also continued to do his own follow-up
studies and, in particular, one at Guy’s Hospital.


Researcher Three: Given what we said earlier
about the strong and weak programmes of health
economics, how is Wickings’ report couched? Is
clinical budgeting presented merely as some- 
thing to help clinicians overcome their practical 
difficulties or is a rather stronger brew offered? 
Researcher One: Well let me answer that by tel- 
ling you my own reactions to reading the report. 
As you know when I first got hold of it I was 
excited because it looked like real science. As its 
title indicates, it seemed to be all about experi- 
ments. The original proposal made to the DHSS 
was formulated to answer specific questions, the 
central three being (reading): “(i) Can the
Westminster/Brent budgetary system for consul- 
tants be established in very different districts? 
(ii) If so, what happens? (iii) What general con- 
clusions can be drawn?” (Wickings et al., 1985, 
p. 4). It was claimed that (i) and (ii) could be re- 
solved by, it says here, “direct observation” 
(Wickings et al., 1985, p. 5). The report as a 
whole is heavy with this sort of scientific 
rhetoric. Technical terms are defined carefully, 
it is written up in the format of a scientific report
with sections on “The Experiments in Outline” 
and “Results from the First Phase Experiments” 
and, as you can see here, it is full of graphs, tables 
and figures. There was, however, one oddity 
which I noticed. This is an early section of the re- 
port entitled “Evaluation”. 

Researcher Three: Research evaluation is quite a 
standard thing in applied social science projects,
particularly in the States. 

Researcher One: That may be so, but I was 
puzzled about why experiments needed exter- 
nal evaluation. Usually, the experimenter’s in- 
terpretation of results is all that is required. 
Researcher Three: Does it give any special reason 
why additional evaluation was felt to be desir- 
able? 

Researcher One: Well, part of the evaluation pro- 
cess involved getting the views of the partici- 
pants in the experiments, but the more interest- 
ing aspect was the setting up of a special Evalua- 
tion Group to monitor developments. It says 
here that “It was also proposed that the DHSS 
should itself establish an evaluation group, from 
which it will receive advice and a report” (Wick-
ings et al., 1985, p. 5).
Researcher Two: Is that the same group headed 
by the Government Chief Scientist, Buller, 
which reported so favourably on the experi- 
ments and which was cited in that Wickings & 
Coles (1985) article in the Nuffield/York 
Portfolio — the article which got you started on 
this whole thing? 

Researcher One: Yes, that’s it. The group con-
sisted of a number of senior health service mana-
gers, a professor of accountancy, a senior medic, 
a regional nursing officer and a regional medical 
officer. 

Researcher Two: In short, all the interest groups 
likely to be concerned with the introduction of 
clinical budgeting. 

Researcher Three: Apart from patients. 
Researcher One: Yes quite, but of course every 
interest group claims to speak for patients! Any- 
way, getting back to the Evaluation Group, at 
their first meeting held in October 1980, they 
decided that there might be a conflict of interest 
between steering and evaluating the project. It 
was therefore agreed that their remit should be 
evaluation only. It says, “The Group’s major role 
is to evaluate the outcome of the CASPE project 
— that is to say make a judgment as to whether
the value of the clinical service planning and 
budgeting approach justifies the cost likely to be 
involved in setting up the budgetary arrange- 
ments. It is hoped that when the project is 
completed the Group will be in a position to re-
commend to the Department whether the ap-
proach should be commended for more general
use in the service” (Wickings & Coles, 1985, p. 
6). That last point is highly significant because, 
of course, it is exactly that sort of recommenda- 
tion which the Griffiths inquiry team made. But 
the Evaluation Group had another role to play: 
“The Department recommended that if, in the 
final analysis, the Evaluation Group considered 
the clinical budgeting research to be of value, it 
would be essential to widely advertise the 
results, thereby allowing other districts to adopt 
a similar management style. The Evaluation 
Group would therefore have an important role 
to play in the dissemination of the research 
results...” (Wickings & Coles, 1985, p. 7). Not 
only were they evaluating the research and mak- 
ing recommendations for future policy, but they 
were also responsible for publicizing the find- 
ings. Provided, of course, the findings were posi- 
tive. I never encountered anything quite like this 
with the physicists I studied. It is as if, in a par- 
ticular area of science, the scientists, their fun- 
ders and the science media were all rolled up 
into one with the power to determine the future 
development of that area. 

Researcher Two: It could be the case that in
applied areas of science this is the way things
work. If you think of tests of new technologies 
such as a new aeroplane, it is so complex and 
there are so many interests at stake that there is 
bound to be some official evaluation process. 
Researcher One: Yes, that may be so. But what I 
don’t see is why the decision over whether or 
not the thing works has to be connected to its 
exploitation. First, you want to know if you have 
a genuine effect; then, if there seem to be practi- 
cal applications, you can seek funds from indus- 
try or support from the government and, if you 
feel you need it, you can always work up some 
publicity. The priority, however, must be on the
researchers’ right and ability to decide first 
whether the experiments work as claimed. 
Researcher Two: Distance really does lend 
enchantment doesn’t it? The less involved you 
get with research on physics, the more your de- 
piction of how things work there relies on what 
you used to dismiss as old-fashioned views of the
natural sciences. Work in the sociology of 
science, such as Latour’s (1987), for example, 
has challenged every one of these points. It just 
isn’t the case that ideas are developed in a “pure” 
context which are then taken up later for pur- 
poses of “application”. Political interests and 
media interests are often there right from the 
start, even in physics. Look at the current fuss 
over the search for room-temperature supercon- 
ductors, for instance. 

Researcher One: I suppose I’ve got to say reluc-
tantly that you're right. But you must agree that 
Robert Millikan never needed an “Evaluation 
Group” to sit over him to decide whether his 
measurements of the charge of the electron 
were worth pursuing. 

Researcher Two: And probably just as well too! 
Gerald Holton’s (Holton, 1978) research on Mil- 
likan’s notebooks shows how Millikan rejected 
lots of measurements with deviant values that 
didn’t fit his preconceptions of what the charge 
should be. If he had had an Evaluation Group 
watching his every move they might have 
noticed that what was to become one of the 
most celebrated experiments in physics was 
actually inconclusive! But joking aside, you 
shouldn’t be comparing clinical budgeting 
experiments with basic science experiments. A 
better comparison is with technologies which 
are being tested in a public context. I don’t know 
much about it, but from my reading of historians 
of technology such as Edward Constant (Con- 
stant, 1980), it seems to be the case that new 
technologies are often tested in a very public 
forum — especially when the public might need 
to be persuaded to take up the technology. The 
trials of the first turbine-driven boat, the Tur-
binia, were held in public. And if you are really
interested in pursuing the analogy, Harry Collins 
(Collins, 1988) has recently sent me a preprint 
on a couple of cases involving public testing: one 
was of the transportation of a nuclear-waste flask
by train; the CEGB staged a public crash to show 
how safe it was. The other case Harry looked at 
was the testing of an additive to kerosene to stop 
aircraft fires being so devastating. Again, a mock 
crash was staged. Both are cases of tests where 
there were technical experts and the media pre-
sent to evaluate the results.

Researcher One: Are you saying that these clini- 
cal budgeting experiments are more like the 
testing of a technology than scientific 
experimentation? 

Researcher Two: Well, in a way. You can argue, 
for example, that health economics is a social 
technology. Clinical budgeting involves an 
attempt to change human behaviour using the 
principles of economics and, as a system, 
includes material artifacts such as computers
and software packages. Indeed, there is a lot of 
new work in the sociology of technology which 
argues that all technologies are irretrievably a 
mixture of social, material, economic and politi- 
cal elements — a “seamless web” is how it is 
described in the book by Bijker et al. (1987).
Researcher One: I like the idea of health econ- 
omics as a technology. It means that we can pre- 
sent this material to the sociology of technology 
people and get an additional audience for our
research. But going back to the Evaluation 
Group for a moment, it does seem to be different 
from those very public tests you mentioned. For 
one thing it was all kept under wraps by the 
DHSS, who appointed the Group in the first 
place and, it was, of course, run by their own 
chief scientist. The Evaluation Group in this case 
seems to have acted as a buffer between the ex- 
periment and the wider public and policy con- 
texts. If you recall it was the recommendation of 
the Evaluation Group which was cited by Wick- 
ings & Coles (1985, p. 7) in their article rather 
than the results reported here (pointing to re- 
port on table). 

Researcher Two: The Evaluation Group can thus 
be seen as a neat way of giving an authoritative 
public interpretation of the experiments with- 
out having to address the messy and potentially 
defeasible process of the research itself. And, of 
course, the fact that the Group is formally inde- 
pendent from the experimenters gives it even 
more authority — which is why Wickings & 
Coles cite the report of the Evaluation Group 
rather than their own findings. As I’m sure you 
are both aware, this is yet another instance of the 
well-established finding of the sociology of 
science that distance lends enchantment to sci-
entific certainty. The further you are away from
the messy details of laboratory work, the more 
certain the results appear to be. The Evaluation 
Group in this case was able to transform the 
messy reality of experimental activity into a firm 
policy edict. 

Researcher Three: Clearly we will have to study 
this Group further. But first I for one want to 
learn more about the experiments themselves; 
were they really that messy? 

Researcher One: It was more than just a mess, it
was a disaster. The most interesting thing about 
the report is that as I read it I became increas- 
ingly puzzled as to how the research could ever 
be seen as a success. 

Researcher Three: It ran into difficulties then?
Researcher One: You can say that again. But
before getting on to what those difficulties con- 
sisted of, let me tell you a little bit about how 
they planned to test the clinical budgeting sys- 
tem. 

Researcher Three (looking at report): I can see
that it’s full of these cursed acronyms. What on 
earth are CATs and DMTs? 

Researcher One: CATs, or Clinically Accountable 
Teams, are the new formations in which clini- 
cians are supposed to work. CATs have planned 
budgets which have previously been negotiated 
with the DMTs, the District Management Teams. 
The planning agreements with the DMTs were 
known as PACTs, which are Planning Agree- 
ments with Clinical Teams. All pretty straightfor- 
ward isn’t it? PACTs are the main feature of this 
type of clinical budgeting. A PACT is established 
each year which would set the budget for that 
year and outline the various clinical develop- 
ments that were planned. As part of the project, 
CATs would be provided with extensive infor- 
mation as to what their various costs were. It 
says here that the CATs “were to be afforded
major opportunities to redeploy resources 
within their budgetary limits” (Wickings et al., 
1985, p. 17). This is the basic economic 
rationale designed to change clinicians’ 
behaviour which I outlined earlier. If the clini- 
cian has the responsibility for the budget he or 
she will spend the money in a more economi-
cally efficient way. There is much debate
amongst health economists over the most effec- 
tive form of incentive for clinicians, and in this 
form of clinical budgeting virement is the main 
incentive. 

Researcher Two: This is the “strong programme” 
of clinical budgeting? 

Researcher One: Exactly. They then selected 
three particular districts of the NHS — Oldham, 
Southend and East Birmingham — in which to
run the experiments. Similar districts were 
selected as controls. 

Researcher Three: That seems fairly clear. So 
what does the report say? 

Researcher One: Well, after setting out the aims 
of the research the report profiles the three 
different districts and outlines how the research 
was implemented in each of them. Great atten- 
tion is given to what is called the “Organisational 
Environment”. The reason for this is spelt out 
later where it says, “During the five year period 
of the research a large number of fundamental 
changes occurred in the orientation of the NHS. 
In combination with the more usual factors such 
as staff changes and selective industrial action — 
they provided an environment within which the 
research took place and against which the 
results should be evaluated” (Wickings et al., 
1985, p. 18). 

Researcher Three: It sounds as if they are hinting
at problems to come. 

Researcher One: That’s right. And the first prob- 
lem is a pretty damning one. Look, this is the 
chapter in which the results are given. They are 
prefaced by the statement that there will be no 
results from East Birmingham at all! This was 
apparently because the project had to be aban- 
doned in that district before any discussions 
with clinicians were held. 

Researcher Two: That looks to me like a pretty 
straightforward failure. 

Researcher One: But the question is what counts
as success or failure? Since the project at Bir- 
mingham never got started it could be treated as 
not properly a part of the experiment at all and 
therefore neither a failure nor a success. 
Researcher Two: That sounds like gross ad 
hocery to me.
Researcher Three: Surely some sort of reason is
advanced as to why that part of the experiment 
was abandoned? 

Researcher One: It says here, “There is almost no 
reference to the project in East Birmingham 
because (a) a separate report for the Evaluation 
Group has already been prepared and (b) the ex- 
periment was abandoned. ... The abandonment 
was a decision taken by CASPE Research because 
the unit Director judged that the East Birming- 
ham DMT was insufficiently committed to 
implementation within a reasonable time-scale. 
It should be noted, however, that the second of 
the recent NHS reorganisations was in progress 
at the time and this placed great difficulties upon 
the Districts concerned” (Wickings et al., 1985,
p. 52). 

Researcher Two: That sounds to me like a classic 
way of handling a negative result. It is a point 
which sociological studies of the natural 
sciences have repeatedly revealed. Since every 
experiment involves a whole host of back- 
ground assumptions — ceteris paribus type 
clauses — the significance of any experimental 
result is in principle questionable. It can always 
be argued that some factor from the environ- 
ment or some background theory was responsi- 
ble for the negative result. 

Researcher One: Right, that’s the classic Duhem— 
Quine thesis. It’s like what we used to do in our 
studies of physics when we showed how scien- 
tists actively negotiate what counts as back- 
ground and what counts as foreground during 
the course of an experimental controversy. An 
experimenter claiming some new phenomenon 
of the natural world may face hostile critics who 
argue that some uncontrolled background effect 
is really responsible for the results. A good 
experimenter tries to rule out such potential 
grounds for criticism by producing as “closed” 
an experiment as possible. A successful critic is 
one who manages to open up the experiment to 
the environment. 

Researcher Two: But in the clinical budgeting 
case it is the experimenters themselves who are 
citing environmental factors — such as the NHS 
reorganization — to explain why a negative 
result is not actually a disconfirmation of the 
phenomenon.

Researcher Three: Aren’t you two building a lot 
on what is after all only one small aspect of the 
report? 

Researcher One: Oh, but it goes beyond just the 
Birmingham case. The results chapter as a whole 
is full of similar moves to accommodate negative 
results. For instance, the authors continually 
draw attention to the adverse environment they 
faced during the course of the experiment. It re- 
fers here to the “worst of all environments in 
which to test” (Wickings et al., 1985, p. 53) and
it goes on to list a number of organizational
changes which took place at the time. But what 
I found to be so amazing about the report — and 
this is the real gen — was the section on the 
quantitative data. That is the real test of all this 
economic theorizing. If clinical budgeting was 
to have any effect then it should show up by 
changes in the resources used by clinicians. But 
as I read through the lists of all the quantitative 
measures examined I found that there was not a
single number which could be said to show un- 
ambiguously that clinical budgeting was having 
an effect. The best that could be said was that the 
data were inconclusive. 

Researcher Two: That is pretty amazing given
that the whole point of clinical budgeting is to 
bring about changes in how clinicians use 
resources. 

Researcher Three: I would like to know a little bit
more about these quantitative measures. 
Researcher One: Okay. The first thing they 
looked at were changes in non-staff clinically 
related items such as drugs, X-ray consumables 
and the purchase of medical equipment. They 
compared the costs in the experimental districts 
with the region as a whole. They found it a dif- 
ficult exercise to do and were apparently unable 
to come to any clear conclusions (reading): “. .. 
perhaps the only firm conclusion that can be 
drawn is that it is impossible to make any such 
conclusions from this type of data” (Wickings et 
al., 1985, p. 86). They then looked for changes in
resource use brought about by the specific 
PACT agreements. There are pages and pages of 
figures but again their conclusion was, “In sum- 
mary the figures do not conclusively 
demonstrate either a better or worse use of
resources” (Wickings et al., 1985, p. 91). No
firm conclusions could be drawn from data on 
patient management related costs or on case 
mixes either. 
Researcher Two: I don’t believe this. There must
have been some positive results to report. Surely 
the Griffiths team can’t have got it totally wrong. 
Researcher One: Well, there is one positive 
result. Let me read you this: “Although the analy- 
sis earlier in this chapter suggests that little hap- 
pened which apparently changed the overall 
performance of the districts, when measured in 
terms of overall throughput or relative expendi- 
ture on particular headings, it is fair to point out 
that the PACT discussions between clinicians 
and members of the District Management Team 
were found to be worthwhile on a number of 
counts and that during these meetings a number 
of important planning issues were raised” (Wick-
ings et al., 1985, p. 110). Basically they got on
better! (Laughter.)
Researcher Two: That’s really ironic. Their one 
success is in an area which seems to have little to 
do with economics. But this is all very puzzling; 
how on earth could these experiments be 
regarded in any way as a success? 
Researcher One: That was exactly what I was try- 
ing to understand by the time I got to the section 
on “Lessons Learnt”. Indeed, it seems that the au- 
thors of the report themselves realized that they 
faced something of a problem. They wrote at the 
start of this section: “It sounds perverse, and may 
indeed be so, to regard the experiments 
reported here as encouraging rather than disap-
pointing” (Wickings et al., 1985, p. 133). There
follows a list of the “encouraging” points. I must 
admit I chuckled reading this list. It goes as fol- 
lows: “(i) The management teams in both Old- 
ham and Southend have continued to invest in 
staff to support the system. (ii) Much technolog- 
ical development occurred which has since 
been adopted by the Management Budgeting 
demonstration districts.” Those, by the way, are 
the ones set up after the Griffiths report. “(iii)
Some (although the minority) of consultants 
liked and used the available systems and a num- 
ber of beneficial changes were made.” These are 
the changes such as the talking together which I
referred to earlier. (Laughter from Researchers 
Two and Three.) “(iv) The ward sisters in South- 
end enjoyed being budget holders.” (More 
laughter from Researcher Two.) I thought you’d
like that one. Here is another important finding: 
“(v) Mr Jim Blyth, of the Griffiths Inquiry team, 
was sufficiently impressed to advocate what he 
called “Management Budgeting” after his visit to 
Southend and the systems have substantial 
similarities.” You see they are alike after all! Now 
comes the finding to which they attach the most 
importance: “(vi) Perhaps of more significance, 
the national Evaluation Group were supportive 
in their interim report (April 1985).” If you re- 
call, that is the positive report which Wickings & 
Coles quote in their article. Finally they say: 
“(vii) Although there are only limited signs of 
“success” there have been even fewer sugges- 
tions that the overall thrust was wrong” (Wick- 
ings & Cole, 1985, pp. 133—134). 

Researcher Two (disbelieving): Is that it? 
Researcher One: Yes that’s it. After five years of 
experimentation that is what they found. 
Researcher Two: Well, one thing I’ve learnt from 
this project is that in comparison the para- 
psychology experiments looked at by Collins & 
Pinch (1982) seem like Nobel Prize candidates! 
Researcher One: I told you it was dynamite. The 
most interesting thing about this list, apart from 
its meagre nature in contrast with the original 
objectives, is that most of the positive reasons 
given are on the lines of saying it is a success 
because other people like it, or even, in the case 
of the Evaluation Group, because other people 
think that it’s a success! That really does seem to 
put the cart before the horse. If there is no evi- 
dence that the thing works in the first place, you 
could argue that the more people that come to 
believe in it, the bigger is the failure. 
Researcher Three: What was the exact statement 
which the Evaluation Group made in support? 
Researcher One: Well, there is a lot of hedging 
around describing the experiments and so forth, 
but the key part is the last paragraph where it 
says, “Despite the major difficulties encountered 
in the research districts, the Evaluation Group is 
unanimously of the view that in principle this
PACTs centred budgeting system has all the
right ingredients for improved resource man- 
agement in the NHS, and it should be given the 
support needed to ensure its wider dissemina- 
tion within the service” (quoted in Wickings
et al., 1985, p. 7).

Researcher Two: That sounds very positive to
me. The “in principle” caveat is a nice way of put- 
ting it. Of course, in principle, the system is the 
right one even if it doesn’t work in practice. It is 
the classic way to save the phenomenon. If your 
experiment has been refuted you say that it 
wasn’t actually a proper test and therefore the 
negative evidence doesn’t count for anything. 
This puts a nice slant on an argument I recently 
read in a paper by MacKenzie (MacKenzie, 
1988) on the testing of ballistic missile technol- 
ogy. MacKenzie points out that tests of strategic 
ballistic missiles off the coast of the United States 
can be challenged by saying that the results 
obtained there may not be applicable when the 
weapons are used in a real nuclear war. This 
argument was, for instance, made for a time by 
the U.S. manned-bomber lobby. You challenge 
the positive results by pointing to a difference 
between the context of testing and the context 
of use. For those who wish to generalize from the
tests, the context of test and context of use
are held to be similar. But for the critics the dif- 
ferences are significant. 

Researcher One: Back to similarity and differ- 
ence judgements again? 

Researcher Two: Exactly. The connection
between testing and use can be said to be a mat- 
ter of social negotiation. Our own case exhibits 
the same phenomenon but in a different way. 
Here we have a negative rather than positive test 
result being discounted because it is claimed 
that the context of testing was in some way
special because of major reorganizations of the 
NHS which took place during the test. In this 
case too it is claimed that the context of the test 
does not match the context of use. The Evalua- 
tion Group argue that under normal conditions 
of use there is every reason to believe that clini- 
cal budgeting will work properly. In short, the 
similarity and difference between context of use 
and context of test is again seen to be a flexible
resource for argument.

Researcher Three: I think that what the two of 
you are saying is interesting but I am worried 
about one thing. In order to make this kind of 
argument at all, you have had to set up a defini- 
tive version of clinical budgeting — the version 
which says that it is an attempt to change econ- 
omic behaviour and that whether or not it suc- 
ceeds in doing so can be discovered by “direct 
observation” in these experiments. You then 
deconstruct the Evaluation Group’s “success” 
claim by contrasting it with a “failure” version 
which you derive from this report. But isn’t 
there another way of looking at it? Suppose there 
is more than one version of clinical budgeting 
available. I seem to remember that one of you 
not so long ago was arguing this very case. Sup- 
pose we take the version of clinical budgeting 
which Kathleen presented to the clinicians 
which suggests that it is a modest attempt to 
offer practical help and that success is to be 
defined in terms of getting the thing working to 
however limited an extent. You wouldn’t, of 
course, expect such marginal benefits to show 
up in the quantitative data. In terms of this ver- 
sion, rather than being a failure you can start to 
see how the experiments might be seen as a suc- 
cess, especially given the hostile environment at 
the time. The fact that people such as ward sis- 
ters liked the system — which you sneered at in 
that rather sexist way earlier — is actually as 
good a measure of success as anything else. The 
fact that it is taken up by practitioners is surely in 
the end the best criterion of success? 
Researcher One: I see what you are saying, but 
my point is that it was the participants them- 
selves, that is, Wickings et al. (1985) who made
the appeal to scientific rhetoric. And they them- 
selves acknowledged that the experiments were 
less than successful; remember Wickings said, 
quote, “It sounds perverse... to regard the 
experiments reported here as encouraging 
rather than disappointing.” 

Researcher Three: Well, we will clearly have to 
talk to Wickings himself to get his view. It could 
be the case that these different rhetorics are con- 
tinually drawn upon for different purposes. Even 
arguing that clinical budgeting is a technology
seems to involve one particular version.
Researcher Two: I'm glad the stuff on technology 
was useful even if it did only highlight how we as 
analysts are using different versions of health 
economics. But I’ve got to get back to my video
analysis. It’s time for some real research. 
Researcher One: You're forgetting. We’ve got to 
arrange some more interviews. We need to talk 
with Wickings and also somebody who was a 
member of this Evaluation Group. 

Researcher Three (getting up to go): Well, I’ll
leave you both to do that. And if you find time, 
perhaps you could read through these two chap- 
ters of our book on health economics which I’ve 
just finished. (Handing over typescript.) I’ve got
to get back to some administration — another of 
these damn surveys evaluating the department 
has arrived. 


ACT IV: WICKINGS’ WORLD 


The location is a large office resplendent with 
Edwardian furniture at the King’s Fund Trust in 
London. Iden Wickings is seated behind a large 
desk. Researchers One and Two are seated in 
easy chairs in front of the desk. On the desk is a 
small tape-recorder.* 


Tape-recorder (off scene): I’d really like to pro-
test, all these parts played below should actually 
be played by me, after all I recorded the whole 
thing. 

Researcher One (ignoring tape-recorder): I 
wonder if you could just say a little bit about the 
history of your involvement with clinical 
budgeting. 

Wickings: Okay, we’ve done a series of projects. 
The Westminster one was the first that I know of 
in which we did an experiment... and it cer- 
tainly seemed to demonstrate some change. We 
could go into it if you’re interested . . . [Then at 
Brent] we tried to achieve the same changes just 
using costing data — I don’t know how familiar 
you are with the distinctions and such like — we 
reported the cost to peer groups with various
hypotheses about the high cost group... and 
saw nothing for three years, despite everybody 
saying how valuable and important the informa- 
tion was. And so we then went into Southend 
and Oldham and another district... 

Researcher Two: East Birmingham? 

Wickings: East Birmingham. And there we were 
trying to see whether one could get the same 
results using only the variable costs and exclud- 
ing staffing and capital costs.... I get a bit irri- 
tated, people say, you know, “You’ve been doing 
this for ten years and what have you shown?” But 
in fact each time we’ve been trying a different 
approach and we believe that we’ve gradually 
learnt the conditions under which it’s likely to 
be successful. We would no longer recommend 
it for universal adoption, certainly until some 
successful projects have run for a while, which 
hasn’t happened yet. ... 

Researcher Two: Is there a distinction between
clinical budgeting and management budgeting? 
Wickings: Well there is yes, but it’s a bit arcane,
I mean I don’t know how much detail you want 
to go into, but management budgeting — which 
is now called resource management — is con- 
cerned with the technique of distributing costs 
to managers so that they can control them ... 
The differences are not all that clear. [In manage- 
ment budgeting] they also believed very much 
in charging out overheads. We did not in the pro- 
jects that we’ve been engaged in, because we felt 
that it was important to emphasize those things 
the clinician could influence himself, you 
know.... 

Researcher One: I read the Nuffield Portfolio, 
there is an introduction from Tony Culyer and 
he describes these clinical budgeting trials, or 
whatever, as experiments, I mean did you see it 
yourself as an experiment in that sense? 
Wickings: Yes, I mean; yes we tried, in so far as 
we could, to set it up so that you would get 
genuine learning, so I suppose, I don’t like the 
phrase “experimenting”, but yes, I don’t mind, I 
suppose . .. we worked — sorry I’m sort of stam- 
mering really — we certainly saw it as being 
innovative, and therefore worthwhile if you
were going to learn from it, of trying to establish
some reasonable sorts of controls, you know and
such like, it makes it more complicated and such
like. And because of the difficulty of learning
from these things and forming balanced judgements,
there were various ways you ought to
evaluate the project.

*An interview with Iden Wickings was conducted on 2 February 1987.

Researcher One: I noticed there was an Evaluation
Group set up...

Wickings: That’s right and they think it’s the best 
thing since fish-fingers more or less, they were 
very supportive. But you see that’s an example 
that one of the difficulties we felt . . . was of defin- 
ing what success is. 

Researcher Two: Returning to the point about 
the experiment nature of it. I mean I got the 
impression when I read the beginning of this re- 
port (Wickings et al., 1985) these were being set
up as kind of like tests of the idea, and something 
riding on these particular events... 

Wickings: I think that’s probably right, I mean 
there’s a limited number of occasions on which 
you'll get governmental money to try things 
out... 

Researcher Two (laughing): Sure. 

Wickings (laughing): Precisely. Particularly if 
they’re expensive as in many senses this was. I 
still think actually that it’s a piddling little invest- 
ment compared with the importance of trying to 
be able to get a negotiated set of expectations of 
what each expect of the other, from general 
managers and clinicians, but obviously you have 
to be very discreet. We thought it would be 
easier than it was. There were terrible difficul- 
ties due to the repeated reorganizations of the 
health service. I mean that genuinely, it meant 
that people were coming and going and that 
people were without staff and so on and so forth. 

Researcher One: This affected the Birmingham 
part of the study didn’t it? 

Wickings: It affected all of them. Birmingham 
was part of it, but it also affected Southend. At 
one stage, only the district administrator had 
been working in Southend out of any of the man- 
agers for longer than a year ... and to expect 
them to be introducing new ways of working 
immediately during that period was very difficult.
And there were three successive changes
in Oldham, but I mean that’s what the world’s
like I’m afraid, but it makes it very difficult; I
often wished I was injecting rats in cages.
Researcher One: Could you make an argument 
that in a sense for a successful clinical budgeting 
system to work, you must be able to deal with 
those sort of... 

Wickings: Yes, you're right ... the period that 
you don’t want people to go is when you're set- 
ting them up. You know, we never got to any sort 
of stage ... And I think we’re going to find the 
same troubles with the resource management 
system; that it’s very difficult to set it up. I think, 
going back to the point I was making, a lot 
depends on whether you have managers of the 
capacity to cope with it, who want to go on 
doing it... 

Researcher One: Your various economic criteria,
there wasn’t actually much change as I
understand it in...

Wickings: Well, I don’t regard those as a success, 
I don’t think they demonstrated very much, 
except that in the circumstances in which we 
tried it, it didn’t work. 

Researcher One: Yeah, and you put something, I 
quote from the report: “It sounds perverse to 
regard the experiments as encouraging rather 
than disappointing.” 

Wickings: Yes. 

Researcher One: So you regard the experiment, 
that those ones are largely a failure then? 
Wickings: Well, I don’t like these words “failure” 
and “success”. You know how these things work 
don’t you? 

Researcher One: Yes. 

Wickings: What I meant, the things I felt that we 
could really be encouraged by, were that it was 
never rejected, firstly. We can try and start in a
new system, and if you’ve worked with doctors
very much — have you worked with clinicians in 
hospitals? 

Researcher One: No. 

Wickings: Well, they're an extremely powerful 
and rightly I think arrogant lot by and large, very 
independent, idiosyncratic, they’re not managed 
by anybody you see, they’re like professors. And, 
that to get them to accept changes of this sort is 
always difficult. Now, by and large, the medical
staff were either apathetic, or supportive, there
were one or two fierce opponents, but they 
were unusual ... Where I felt the encourage- 
ment came was that if the managers had been 
able to do their bit, I think that the evidence was 
still encouraging — but obviously you can’t be 
too unkind about people who’ve been very kind 
and worked for you over several years, I mean 
it’s very good of them to have done it at all... For 
example, East Birmingham — the medical staff 
there, I hadn’t, we had no problem with them at 
all, they were very keen. But I mean I stopped 
working there because the bloody DMT [District 
Management Team] weren't putting any effort 
into it and you can’t go on pouring money into
these things unless people are putting some
effort into it. I mean they were all different, you
see one of the troubles, one was trying to do so 
many things at one time ... we were testing out 
something to see whether it could be a national 
model — that was the idea. And our conclusions 
on that were, that you don’t stand a hope in hell, 
of doing it, if you haven’t got at least some good 
models demonstrated by test pilots, that was the 
analogy we used. 

Researcher One: Can I just go back to something 
in the report, the interim evaluation report 
which is attached to that appendix. Would you 
say it’s fairly positive? 

Wickings: Yes, I mean I was, I was pleased. I 
think very correctly (laughter), I mean we felt 
that it was our job to present the results of our 
evaluation, and we couldn’t ourselves claim that 
there was much evidence. Nonetheless, it’s still 
our view that the potential of the system has not 
been invalidated at all and that’s what fortu- 
nately this Evaluation Group — they were quite 
tough with us at times — but they helped to say 
that there wasn’t a better way forward in their 
view; that there is always going to be resource 
shortage, you have to find ways of handling that. 
That does in some way or another require some 
dealings between clinicians and managers and 
although you haven’t done it very well yet, that’s 
the way forward, to do something like that. 
Researcher Two: So in effect it wouldn’t have 
mattered what the actual results were in the 
experimental districts?
Wickings: Yes it would, yes I think it would, I
mean if there had been absolute confusion, I
mean goodness knows what they would have 
said. But they didn’t say — which was an expec- 
tation when we did our research originally — 
“this should now be implemented nationally” 
(bangs desk). There was nothing like that. And 
what they really sanctioned was continued work 
to try and get it to work. And the resource man- 
agement initiative is the same . . . and they’re try- 
ing six districts with again different patterns, 
because it’s bloody difficult. 

Researcher One: That came through the Griffiths 
Report? The resource management? 

Wickings: Yes, that’s right.

Researcher One: Because that’s one of the suc- 
cesses, Jim Blyth with the Griffiths commission. 
Wickings: Well, Jim Blyth went down to see
Southend ... but you see again, I think he under- 
estimated the complexity of it, because the Grif- 
fiths report said they had set these up to be im- 
plemented in six months. Well two years later 
they were just at the same stage we had been. It’s 
much more complex than people seem to 
understand... 

Researcher One: Why do you think people have 
underestimated the complexity of introducing 
this? 

Wickings: It’s partly the boredom factor. I mean, 
the health service is greedy for new ideas. And a 
new idea comes along and everybody says, “tre- 
mendous!” And for a while it’s terribly fashion- 
able, “this is it!” and so on and everybody 
assumes it’s going to be working whisky-a-gogo 
in no time... And then the actual business of set- 
ting that up and running it and bringing about so- 
cial change in a very complex organization, 
everybody’s found that — not just us, I mean you 
look at the literature about the introduction of 
social change in industry — to take a good many 
years to bring about a very complex change is as 
nothing. And from the outside, I mean I could 
have said to British Leyland: all they’ve got to do 
is have fewer people and produce more cars, and 
they’d be wonderful! That’s what people think, 
but as you know when you try and do that 
within the organization, it’s very difficult. 
Researcher One: It’s interesting the role of the
researcher in this sort of dichotomy, because in
a sense we researchers are always trying to stress 
how complicated it is and it’s difficult to get hard 
and fast findings. Well in a sense the policy 
people say, “well that’s no use, no, we want to 
make these new policies”. Did you find yourself 
caught in that sort of dichotomy? 

Wickings: Well we were under a sort of pressure 
to produce results. I mean there wasn’t any very 
great support for what you might call the 
delights of academic learning. They wanted 
results. But I can certainly understand it. I think 
we all of course tend to see the things we’re en- 
gaged in as more complex than others will see it 
unless they’re involved in it. I don’t know, I feel 
that the fact we were trying to change the model 
we were testing each time and that we were try- 
ing to bring about a very significant change in 
complex organizations at a time when the chief 
officers were regularly changing ... I felt one 
shouldn’t be discouraged by what one had seen, 
because there hadn’t been any evidence to
persuade one that it was wrong; that it was difficult,
yes.

Researcher Two: But isn’t it quite likely that 
given all this kind of turbulence, all these organi- 
zational changes being forced upon the health 
service, that people working within it and maybe 
especially consultants, would see the clinical 
budgeting effort, the PACTs experiments, as just 
another one of those? You know, another one in 
the train of... 

Wickings: Yes. That’s right. There’s lots of
people thinking that at the moment. 

Researcher Two: I mean aren’t they in a sense
right, that it’s just another one of those? 
Wickings: Well yes, of course, it’s just another 
one of those. It’s not a sort of Holy Grail or any- 
thing like that. The health service has not been 
transformed by it. I mean it is now routinely 
accepted by most clinicians that they are going 
to have to work within a budget at some stage... 
If you talk to the managers up and down the
district they also feel that, that if the cash limited
state funded system is going to be managed at all,
it must be able to make choices: a bit more on 
this and less on that and some sort of budgeting 
system is necessary for that. And I think, myself,
that we were largely accountable for that
change. The fact we haven’t, I’m afraid, been able
to show it all working whisky-a-gogo is a pity, I 
wish we could, I'd be Sir Iden Wickings or some- 
thing now. (Laughter.) But that’s how real life is,
isn’t it? 


At this point coffee is brought in and the tape 
recorder is switched off. Wickings asks the 
researchers whether this is the sort of thing they 
want from him because he has found the line of 
questioning rather unexpected. After reassur- 
ance from the researchers that the interview is 
working very well, Wickings goes on to outline 
the different forms of management budgeting 
and resource management which have been 
developed post-Griffiths. The interview re- 
sumes: 


Researcher One: How do you judge success or 
failure in these resource management projects? 

Wickings: I think that’s very difficult ... The
sorts of things that I would regard as being evi- 
dence of success, are that the people locally 
claim that they’re able to be better social actors 
now than they were before and can produce 
some evidence to support that. Now, it would be 
very nice to be able to say that, you know, the
mortality rates have dropped or something, but 
the likelihood of that being shown is so slight 
and one just has to accept that that’s not there. 
What I would call constructive redeployment of 
funds is some evidence of it being useful I think, 
particularly if people say, “we can now do this 
and we could do that and we couldn’t have done 
this without that system.” And the sort of picture 
I have in my mind’s eye, of success, would be 
there being a set of overall plans . . . a set of plans 
for each specialty in which the managers and the 
clinicians both feel they’re working to achieve 
the same things. And success will be that they 
both know what successful is like, you know and 
that seems to me to be an important sort of social 
goal of these efforts. 

Researcher One: It’s interesting, in that CASPE 
report, some of the criteria for judging the
success early on were listed as being economic, I
mean they’re like economic measures and your 
most recent one is very much; you said “social 
actors”... my eyes sort of lit up... 

Wickings: I think ... (Laughs.) Yeah, there isn’t 
a monocular view of the world is there? I mean if 
you have got one, you’re more a fool I think and 
one should if possible try and get a sort of vecto- 
rial approach of various views. At least I think 
that. But I mean I quite often have economists 
on, working in CASPE ... I think economists are 
— I’ve just reviewed a book by Gavin Mooney —
I think economists have a very mechanistic view 
of life; a view that people are logical. And the 
expectations they have are based upon various 
logical hypotheses about reactions to different 
incentives and so on and so forth. Now, I don’t 
think obviously they do have to accept that, in 
fact people are far less logical than that and make 
the most bizarre choices and often hold con- 
tradictory views. When you actually work with 
them you find that they’re in some cognitive
dissonance way having difficulty with trying to
reconcile the greatly conflicting views... I mean 
economics is a very helpful way to analyze trans- 
actions between people, I’m very persuaded by 
it. I’m quite glad I didn’t do a first degree in econ- 
omics mind you, because people who work with 
me who did that, they often seem to me to be 
irritated that people don’t actually seem to work 
the way that they bloody well should, you know 
... I’m not sure whether the changes are gross
enough to be seen in these big studies and that’s
one of the problems about it. And that’s why it 
may be that you need to do more studies about 
what’s happening at the micro level within the 
organizations and see if, if you believe that’s ... 
Researcher Two: You’ll need sociologists for
that. (Laughs.)

Wickings: Yes there’s no choice in that. 


Researcher Two: Okay. Suppose it was instituted
as a national policy throughout, throughout the
NHS... 

Wickings: Which I actually think it will, oddly 
enough, despite all the difficulties, I think it will, 
because I think it’s logical (laughs); that sounds 
illogical. 

Researcher Two: Right okay, well if it was...
Wickings: But not before 1995 I would say so.
Researcher Two: Okay, so it’s a very long term... 
Wickings: I think it will gradually come yes. 
Sorry, I keep interrupting. 

Researcher Two: Yes, you do it quite success- 
fully, I’ve forgotten what I was going to say.
(Laughs.) This always happens. 

Wickings: Yes, clinical budgeting. 

Researcher Two: Yes thank you, yes, that’s what 
we were talking about. Yes, so how would you 
envisage the difference in the health service, 
from a patient’s point of view? I mean would 
there be fewer waiting lists for instance? 
Wickings: I passionately believe that we ought 
to be giving our patients a better deal than we 
now are. Often they’re getting a very good deal, 
but there are many times they're not and we 
don’t seem to have the mechanisms to handle 
that. And I know I’m talking long term and I’m 
talking on the assumption that one has got some- 
thing, something like a national system, so I’m
making huge gigantic leaps you realize that. But
if you imagine that most districts had something 
like this and that the information was shared, not 
only would you be able to have people like me 
say, “Well this is what I would expect us to see. 
Why are we only seeing that?” As new consul- 
tants appeared, you’d be able to spell out what 
you thought they were going to do... 
Researcher One: You said during the coffee 
break that you found our questions quite extra- 
ordinary. Why, by the way? 

Wickings: I don’t know really. I hadn’t really 
expected that the discussion would go this sort 
of way. And also I’m wondering if you’re going 
around a whole series of projects like this, how 
you’re going to draw common things together 
from them. I find that interesting. Will we be able 
in the end to see something written? 
Researcher One: Oh, I hope so. 

Researcher Two: We've already written three 
Researcher One: Some of our stuff, we've already 
presented it to the Health Economists’ Study
Group. 

Wickings: Yes, unfortunately, I’m not a member 
of the HESG because I'm not a health economist 
you see...


ACT V: AN EVALUATION MEETING


The research on clinical budgeting has at long 
last come to an end. Researcher One has man- 
aged to interview a member of the DHSS Evalua- 
tion Group — the first member of the group he 
had approached had refused to talk on the 
record. The paper for Brunel University has also 
been delivered. The three researchers are now 
sitting in Researcher One's office at the Univer- 
sity of York. They are talking over what they 
have achieved in the research. 


Researcher Two: My feeling is the Wickings 
interview went quite well. He gave us a lot of his 
time and seemed to talk freely. He definitely pro- 
duced a “weak programme” version of clinical 
budgeting. He said that he didn’t like us referring 
to his research projects as “experiments” and he 
pointed out that it was very hard to say what suc- 
cess or failure meant — he preferred to talk 
about it as a “learning process”. The role of the 
PACTs agreements seems to have been to pro- 
vide a “negotiating framework” to get more 
explicit discussion between clinicians and man- 
agers of their future plans. The direct route to 
change on economic grounds advocated in his 
article in the Nuffield/York Portfolio — change
measurable by “direct observation” in
“experiments” — was replaced by a rather different
conception. This is not to say that Wickings implied
that PACTs were totally ineffective, but rather 
that what impact they had was not produced by 
doctors, managers and nurses simply acting as 
individual economic calculators. At one point I 
even heard him hinting that the underlying
problems of the NHS were sociological rather 
than economic. 

Researcher One: Yes, I thought he was going to 
offer us both jobs when he said that! Great! I 
thought, at last a bit of consultancy. Seriously 
though, the experimental rhetoric did rather 
seem to vanish when we got talking. In the end 
his research turned out not to be “science” in
the sense of physics, but neither did it seem a 
piece of “quick and dirty” policy research. 
Remember how he himself referred to what he 
was doing. He said he had been “doing this for
ten years”, “trying a different approach [each
time] to try and see”; he described it as “a learn- 
ing process”, “to try and see the conditions 
under which it’s likely to be successful”. The 
science-like aspects were reduced to “trying to 
establish some reasonable sorts of controls” thus 
“making it more complicated”, and he talked 
much more about evaluating it in terms of “ba- 
lanced judgements” and “things we could really 
be encouraged by”. 

Researcher Three: The language of hypothesis- 
testing and experimentation, of cut and dried
success and failure, certainly doesn’t seem to do
such research justice. Maybe dealing with the
“real world” requires the sort of research where 
you learn slowly over a long period of time by 
trial and error. 

Researcher Two: Quite possibly. But the scien- 
tific rhetoric which was dominant in the CASPE 
report made its appearance in the interview 
nevertheless; though admittedly usually in res- 
ponse to our formulations. But he quickly 
adopted the alternative way of talking and 
indeed seemed rather unsure of himself when 
we asked him outright whether he regarded the 
thing as an experiment. 

Researcher One: I know, but it only makes me
depressed. I started off this project thinking we'd 
at last found some real science — “forget your 
QALYs”, I thought, “it may not be physics, but at 
least they have experiments” — and it turns out 
that it dissolves into something which as far as I 
can see is not too dissimilar to sociology. 
Researcher Two: And it’s all the better for that 
too. I can’t see why you’re depressed at finding 
that out. I’m more reassured. And physics is not 
so different either — at least according to mod- 
ern sociology of science. 

Researcher Three: And if you weren't so obsessed 
with physics you would see that the sort of “par- 
ticipant centred” sensitive research which Wick- 
ings has evolved is probably the best that you 
can do in a policy context. You’ve got to admire 
him for having stuck with it for so long. 
Researcher One: If you insist. But then we still 
have the problem that he isn’t a proper health 
economist. Anyway, moving on, at least I got 
some real dirt from the member of the Evalua-
tion Group I interviewed.* It turns out that the
Group was under direct pressure from the 
government. I don’t think either of you have 
seen the transcript yet. Perhaps I could read 
some of the relevant pieces to you? 

Researcher Two: You managed to use the tape 
recorder this time did you? 

Researcher One: Sort of. I felt there might be 
problems after the first person we approached 
said he didn’t want to talk on the record at all. 
This time I managed to record half an interview.
He let me use the recorder alright, but then we 
were interrupted by a phone call and — I’m 
embarrassed to say this — when I started the re-
corder again I pushed the wrong button. I’m re-
ally incompetent I’m afraid.

Researcher Two: That’s why you never made it as
a physicist. 

Researcher One: I realized what had happened as 
soon as I had finished and I hastily wrote up 
some notes on the train back. Anyway (shuffling 
through reams of paper on his desk) here is the
transcript and my notes (producing several
sheets of tatty typescript). I started by asking
him why they picked him for the Evaluation
Group. He said he didn’t have a clue. Right at the 
start he stressed the importance of the Evalua- 
tion Group being independent from Wickings’ 
team. He said, “The idea was that we would be 
independent of the groups that were actually 
involved in the process. So we weren't really 
part of the promotion activity. We were inde-
pendent of it.” That’s interesting because later
on, as we'll see, he gives reasons as to why in fact 
they weren’t actually independent. He then de- 
scribed the way the Group worked. They went 
out to visit the districts either alone or in pairs 
and tried to talk to everyone including the dis- 
affected people who didn’t think the thing was 
working. He was really quite proud that they had 
tried to find the disaffected people. Now here is 
the best bit, he’s talking about the production of 
the report: “Then we got to a point where they 
decided, they suddenly decided in my view too 
hurriedly that we were to report, a sudden deci- 
sion that we were to report. And I know why that 
was taken because by that stage the Department,
the government, had decided that they really 
wanted to move ahead on this and they thought 
all this pithering around, this slow development 
was really a waste of time and energy. We had to 
make a big push and they wanted to know
quickly what the lessons were from what we had
done. They were also very dissatisfied with us as 
an Evaluation Group, obviously, that we were 
also pithering around.” 

Researcher Two: So there was direct govern- 
mental pressure on the Evaluation Group. At 
what level did the pressure come?

Researcher One: That’s what I asked him next 
and he said, “Somebody high up the system.” He 
also said that after that they decided to do away 
with Evaluation Groups altogether. Anyway he 
said later that he felt most uncomfortable about 
the whole way of working and that it was this 
government chief scientist, Buller, the head of 
the Evaluation Group who was orchestrating
things.

Researcher Two: That’s wonderful, a very clear 
example of direct external pressure on research, 
the sort of evidence which it is very difficult to 
get in sociology of science. A hard and fast case
of political pressures directly impacting on how 
scientific facts are constructed. 

Researcher One: Yes, I thought you would like 
that. 

Researcher Three: Much as I dislike the present 
government, I feel slightly uneasy at the way you 
two are willing to accept this bit of interview 
data so uncritically. After all it is only one version 
and if we talked with Buller he might very well 
give a different account of what went on. 
Researcher One: Come off it, that’s taking this 
stuff on versions too far. It is just a fancy way of 
talking to avoid biting the real political bullet. It 
may be a version, but it is the one I would be pre- 
pared to put my money on. 

Researcher Two: Yes, and what about your argu-
ment about the need to produce an appropriate
version for the occasion at hand? Surely the rele-
vant occasion in terms of our research has got to
be the opportunity to deal with the politics of
the NHS and to criticize Thatcher’s policies. We
can’t hide behind some sort of bogus neutrality
by saying it’s all just a matter of versions.

* The interview with the evaluator took place on 11 March 1988.

Researcher Three: Well, I don’t think we can re- 
solve that issue here. Perhaps the final chapter of 
our book Health and Efficiency might be a more 
appropriate place for that. But getting back to 
your respondent — the evaluator — did he disa- 
gree with the conclusion the Evaluation Group 
produced? 

Researcher One: Well, that’s the problem. By and 
large he did agree. 

Researcher Three: So you could argue that in this 
case the government’s intervention simply
speeded up the inevitable. 

Researcher One: You could argue that, but I’m 
not going to. This evaluator has a rather frustrat- 
ing attitude towards the success or failure of the 
clinical budgeting experiments. Like Wickings 
he spent most of his time outlining what was 
wrong and the horrendous problems they faced, 
but despite all that, he concludes that they’re 
still the only way forward. Here, listen to this: “I 
mean they know one or two things from it, for 
example, the system won’t work without a high 
degree of commitment from management that 
they will change their managerial style to make it 
work. I mean I felt that there were far too many 
people who felt that this experiment was a sort 
of substitute for taking a tough management line, 
somehow clinical budgeting was going to solve 
the problem for you; it wasn’t.” 

Researcher Two (interrupting): That’s an 
interesting argument. He seems to be saying that 
the experiment itself got in the way and
adversely affected managers’ practices — a kind 
of Heisenberg disturbance of the system with 
the measuring instrument. 

Researcher One: I’m pleased to hear that you can
use a bit of physics too. The point is, he goes on 
to be pretty scathing about the experience of 
PACTs: “. . . the clinicians I mean, some of them, 
they were more sceptical because they thought 
it was all a trick to get money out of them... 
They were right in their suspicion, which I think 
was one of the things that was really bad and the 
management weren’t strong enough to run the 
system ...” But then having said all that he still 
came out in favour of it: “Well I think we went
through all the difficulties you see. And then one 
said, “Okay, well what are we saying? Are we say- 
ing it’s all so difficult we should give up? Or are 
the difficulties there because it’s fundamentally 
misconceived, or despite the difficulties we 
should soldier on?” It was at that point that we 
did come to the overall view that it wasn’t mis-
conceived. I mean it does have the right ingre-
dients, and the question then is, is it worth
struggling to make sure these ingredients are
present and used in the right way? ... Generally 
one has to struggle with it. I mean they had three 
goes and three failures but there was no reason 
to stop just yet”. 

Researcher Two: What is the role of the test, 
then, if you go on struggling with it after the 
thing has failed? 

Researcher One: Well, of course, I asked him that 
but he just went back to a priori grounds as to 
why it was so good: “It has the elements in it that 
you think a good management system should 
have. The PACT — this is the central feature — 
the PACT is the focus of negotiation” and so on. 
Researcher Three: So his view of it is rather like 
Wickings’ — it is all about providing a manage- 
ment structure or a context within which 
people can negotiate. 

Researcher One: That’s right. Unfortunately I 
can’t be one hundred per cent sure because the 
tape stopped then. But I did manage to record 
the bit where he had doubts about the indepen- 
dent status of the Evaluation Group: “In a way I 
felt the Evaluation Group got too close to Iden 
Wickings ... We were almost pushed into a role 
of helping him design his system better by feed-
ing back to him the criticisms that were given ...
And I can see that in the interests of health ser- 
vice management that might be a good idea, but 
from the point of view of doing strict evaluation, 
I think we should have been more detached than 
we actually were. I think we got — I won't say 
captured because we are not easy people to cap- 
ture ...” When I suggested that maybe he was 
“sucked” into it he replied: “Well, we got pushed 
into a slightly different role ... We got pushed 
into a role of helping the experiments to work, 
rather than evaluating them as they stood ... 
Well, that’s fair enough, but I don’t think it’s
quite what an Evaluation Group should be doing. 
But, on the other hand, I think it was better hav-
ing us there than what they have done since.
Which is to say without evaluation ...” 
Researcher Three: So even the evaluators seem to 
have ended up assisting the process of applica- 
tion — a kind of “weak programme of evalua- 
tion”, which contrasts strongly with your earlier 
quotation from this evaluator where he talked 
about the merits of independent evaluation. 
Researcher One: That’s right. So many of the 
interviews oscillate around this double rhetoric, 
where one moment everything is scientifically 
hunky dory, rigorous and independent and the 
next it is all couched in vague phrases like “bet- 
ter having us there than not” or “helping the 
experiments to work” and so on. For instance, 
earlier in the interview the evaluator criticized 
other health service studies for not having 
proper evaluation and at that point he offered 
quite a different version of scientific method. I 
had asked for his reaction to the point that at the 
CERN particle accelerators they don’t need 
evaluators and he replied, “But they’re doing 
experiments, aren’t they? ... It’s more like, say, 
well we're going to experiment, with artificial 
insemination and we’re just going to do it, okay 
and then that’s it. Well, and then the people who 
are involved can, if you care to ask them, they 
may give you their experiences of it, but I mean 
there is no attempt by any independent body to
evaluate the experiment. It seems to me that is 
quite bad ... That’s the sort of thing that is hap-
pening in the health service all the time. People
are going off in this direction and going off in that
direction, if they have a nice experience they 
talk about it, if they have a nasty experience they 
talk about it, but it is totally unsystematic.” 
Researcher Two (ironically): Unlike his experi-
ence of evaluating the clinical budgeting experi-
ments!

Researcher Three: There always seems to be a 
contrast made with some hypothetical group 
who are doing things worse in terms of some 
model of scientific method — in this case people
having “nice experiences” and “nasty experi- 
ences” which they merely “talk about”, counter- 
posed with systematic independent evaluation. 
This appears to be the way it always works: con-
trasting pairs of opposed versions of scientific
method are deployed, either to deconstruct an 
overly systematic version, as Wickings did to 
economics in his interview, by saying it is too
mechanistic, inflexible and simplistic and
doesn’t take account of real social actors or 
whatever; or more usually to deconstruct soft re- 
search by saying it isn’t hard, rigorous, tested 
with independent evaluation, or whatever. Any- 
way, I'm curious to know how you got on at 
Brunel. That must have been difficult: an audi-
ence of general sociologists, sociologists of
science and all sorts of economists — and all at 
once.’ 

Researcher One: I was trying to forget about that. 
That was another depressing experience. I gave 
the paper which dealt with all that stuff about 
clinical budgeting being a social technology —
you know, an attempt to change human be- 
haviour with the use of economics — and which 
treated the CASPE experiments as a test of this 
social technology. Then I went on to decon- 
struct the tests in the way I’ve talked about 
before, by a little analysis of the CASPE report 
which showed what a disaster they had actually 
been. I read out the list of the so-called “points of 
encouragement” resulting from the tests and 
there were just howls of laughter from the audi- 
ence, especially when I came to the bit about the 
ward sisters enjoying being budget holders. But 
then I got pulled apart in an odd kind of way in 
the question session, or rather pulled in several 
directions at the same time. No one seemed 
entirely happy with what I had done. In the first 
place, a sociologist interested in macrosociology
said that it was clear that the whole thing was 
really to do with the role of the state and 
Thatcher’s policy to squeeze the NHS and my 
study didn’t tell us enough about this wider con- 
text in which the experiments were taking 
place. Then there was a sociologist of science
who argued against this macrosociologist but 
who was also dismissive of my paper because he 
said everyone knew economics wasn’t a proper 
science and so what? Then one of the 
economists, who seemed to be something of a 
theorist, got uptight with the sociologists for say- 
ing that economics was just a matter of social 
contingency; he felt there was more to proper 
economics than that, and in a way I was in sym- 
pathy with this view because of course I wanted 
to show that there was something worth decon- 
structing in the first place. Finally, there was this 
health economist and he said that of course clini- 
cal budgeting experiments were nothing to do 
with economics, but then he said he had just 
been given a contract by the DHSS to evaluate 
the latest resource management experiments! 
(Speaking in an increasingly garbled fashion.) 
So there I was trying first of all to argue against 
the macrosociologist by pointing out to her that 
the wider social context such as Thatcherism 
only took on meaning in people’s everyday prac- 
tices, such as their experiences with these ex- 
periments, so of course I sided with the 
sociologist of science who supported me on this, 
but then found myself arguing against him by 
saying that economics was the social science 
most in need of deconstruction and hence 
ended up agreeing with the uptight economic 
theorist; then I proceeded to argue against the 
health economist who claimed clinical budget- 
ing wasn’t health economics at all by saying that 
it must be health economics because everyone 
working in health economics including himself 
always said that real health economics was what 
other people did, ha ha — and anyway why was
he evaluating the resource management experi- 
ments if they weren’t health economics? By the 
way, I said I would send him the final version of 
our study in case it was of any use to him in his 
evaluation. 

Researcher Two: Poor guy! 

Researcher Three: Anyway, you survived. It 
sounds to me as if you did the usual bits of 
juggling which people have to do to survive in 
the social world: for each successive practical 
occasion constructing and deconstructing ag-
reements and alliances by constructing and de-
constructing versions of sociology and econ-
omics as you go along. Of course, normally it
isn’t such hard work because you don’t get all 
these different groups with all their different ver- 
sions together in one place. 

Researcher One: Except, perhaps, in the health 
service. 


Researcher Two: It seems to me that we ought to
get clear just what it is that is inadequate about
the Brunel version of clinical budgeting. 
Researcher One: There’s no problem there. What 
was wrong was that I took only one version of 
clinical budgeting — the strong programme ver- 
sion — and deconstructed it by recovering the 
weak version.

Researcher Two: That’s not quite how I see it. I
think what you did was to privilege the strong
version of success, as formulated in Wickings’ 
scientific rhetoric, by using that to deconstruct
the weak version of success — the ward sisters 
enjoying their budgets and so on. By doing so, 
you effectively offered your own evaluation of 
the results of the experiments: they were a 
failure and not, as the Evaluation Group and the 
ward sisters and Wickings in his sociological 
mood all thought, a (qualified) success. In short, 
far from using the weak-programme version of 
clinical budgeting to deconstruct the strong-
programme version, as you claimed just now, I
think you did the exact opposite: you decon- 
structed the weak version with the strong ver- 
sion. 

Researcher One: You might be right. In any case, 
we clearly need to think hard about how we are 
going to write this stuff up for the book. But I’m
worried that if we give equal prominence to 
both versions — à la BBC — we'll end up not
really saying anything. 

Researcher Three: Alternatively, like the health 
economists, we could choose to present the ver- 
sion which is most suitable for the practical 
occasion at hand. 

Researcher One: Whatever that might be! 
Researcher Two: In that case we've got to make 
up our minds what the practical occasion is. I 
mean, are we doing real science, or “quick and 
dirty” policy research, or sensitive participant-
centred research?

Researcher One: We don’t want to spend years
on this so we can’t be doing that last one. And
we’re not getting paid enough and have already
taken too much time to be doing a “quick and
dirty”, and we’re too incompetent to be doing
science. Let’s hope that by the time we’ve
finished our book Health and Efficiency we find
out what we are doing.

Researcher Three: Right. So let’s think about this
carefully ...

THE IMPACT OF FEEDBACK ON INTER-RATER AGREEMENT AND SELF INSIGHT
IN PERFORMANCE EVALUATION DECISIONS


PETER F. LUCKETT and MARK K. HIRST


School of Accounting, University of New South Wales 


Abstract 


Previous research on staff performance evaluation decisions in audit firms has indicated low inter-rater
agreement and suggested that improvements may be achieved by providing raters with information about
the firm’s official policy weighting system. In this study the impact of outcome feedback, task properties
feedback and a combination of both types of feedback on inter-rater agreement, conformity with the official
policy weighting system and self insight was investigated. Results of a laboratory experiment, using 48
subjects from an audit firm, indicated that feedback improved inter-rater agreement and conformity with
the official policy weighting system but did not improve self insight.


Recently, several studies have examined the performance
evaluation decision in large firms of
Chartered Accountants (for example, Jiambalvo
et al., 1983; A. Wright, 1982; Kida, 1984). The
importance of this decision in such organiza- 
tions is attributed to the central role of human 
resources in the transformation process. Jiam- 
balvo et al. (1983) identify performance evalua- 
tion decisions as playing major roles which serve 
judgmental purposes (e.g. decisions relating to
promotion and retention of staff), development 
purposes (e.g. facilitating long-range personnel 
planning) and motivational purposes. Evidence 
shows that the performance evaluation process
is relatively standard across firms of Chartered
Accountants (see Jiambalvo, 1979, pp. 438- 
439). Specifically, the process involves the as- 
sessment of a staff member on a number of di- 
mensions (or cues) relating to certain aspects of 
the job or activity. The number of dimensions 
can be quite high. For example, the U.S. firm in
the Jiambalvo et al. (1983) study used 40 dimensions
while the firm studied here used 14 dimensions.
The ratings made on the multiple dimensions
are then usually combined to form an overall
rating which involves, either explicitly or implicitly,
the use of a weighting system.
The perceived importance of the performance
evaluation decision has motivated a number
of empirical studies into the process itself,
including assessment of weighting processes 
used by individual raters (Jiambalvo et al., 1983; 
A. Wright, 1982), inter-rater agreement (Jiam-
balvo, 1982; Kida, 1984), self insight (Jiambalvo
et al., 1983; A. Wright, 1982), agreement with 
partners’ perceptions (Jiambalvo, 1982) and 
sub-unit differences (Jiambalvo et al., 1983). An
important finding has been wide differences in
inter-rater agreement. That is, the relative
weights attached to the multiple dimensions
when forming an overall or composite rating differ
between raters. These differences have serious
implications for comparative purposes
where choices are made affecting staff based on
overall evaluations made by different raters
within the firm. Since such choices include salary
and promotion decisions (Jiambalvo, 1979),
the existence of different evaluation policies
may raise issues of equity. Furthermore, subordinates
may become confused as to what are the
important aspects of their job. As pointed out by
Kida (1984), the performance evaluation process
is an essential ingredient in employee motivation
and emphasis on different dimensions by
different evaluators could affect the behavior of
subordinates. He concludes that motivation
should be affected by “... stronger and more accurate
links within an organization’s evaluation
system” (p. 137). In order to achieve such links,
Jiambalvo et al. (1983) contend that information
about objective weights could help by making
differences explicit to individuals and helping
them reconcile these differences. This, however,
requires the formulation of an official weighting 
policy by senior management and communica- 
tion of the policy to individual raters. 

The effectiveness of this approach in reducing 
disagreement depends on the endorsement of 
the official weighting policy and its adoption by 
the individual raters. This in turn raises the ques- 
tion of how raters may come to “learn” a formal 
weighting system. One way of informing raters 
about the appropriateness of their evaluations is 
to provide information about the correctness of 
a particular evaluation. This is referred to as out- 
come feedback and may enable the rater to un- 
derstand better the formal weighting system. Al- 
ternatively, information about the formal prop- 
erties of the task (specifically here, the relative 
weights of the various dimensions used to make 
an overall evaluation) may be communicated
prior to making evaluations. Such an intervention
is often referred to as task properties feedback,
or alternatively, feedforward (Steinmann,
1976).¹


The purpose of this research is to investigate
the role of both outcome feedback (OF) and task
properties feedback (TPF) in assisting individual 
raters to capture the formal policy weights. A 
laboratory experiment was used to examine the
evaluation process and investigate the impact of 
feedback on this process. As there is some disag- 
reement about the relative effectiveness of OF, 
TPF, and OF and TPF in combination (see Har-
rell, 1977; Hoskin, 1983; Kessler & Ashton,
1981; Libby, 1981), it was felt that these three 
feedback treatments should be considered. Of 
particular interest were: inter-rater agreement 
under different feedback conditions; the confor-
mity of raters with the official weighting policy
of partners, under different feedback conditions; 
and the level of understanding of their own 
weighting systems or the self insight of raters
under different conditions. 

In brief, the results of the study show that 
feedback promotes both inter-rater agreement
and conformity with the official weighting system.
In contrast, feedback does not affect self insight.
However, self insight is high for all experimental
groups.

In the next section the research hypotheses 
are developed. The research method is then de- 
scribed, followed by the results of the study, 
which are discussed in the final section. 


HYPOTHESIS DEVELOPMENT 


The main focus of this study concerns the im- 
pact of feedback on inter-rater agreement. Al- 
though there are exceptions (for example, A. 
Wright, 1982), previous research (for example, 
Jiambalvo et al., 1983; Kida, 1984) has found the 
level of inter-rater agreement for staff perform- 
ance evaluation decisions to be low. For ex- 
ample, Jiambalvo et al. (1983) reported mean 
pair-wise correlations in three subunits of a U.S.
Certified Public Accountants firm to be 0.61 
(audit), 0.62 (managerial services) and 0.49 
(tax). They described these correlations as 
being fairly low and argued that “... there would 
be some difficulty in making an overall evalua-
tion that would be acceptable to all members of
the subunit” (p. 20). However, in these studies, 
subjects did not receive information about any
formal official weighting policy. Apart from
some common experience among subjects with 
the type of task being used, it is unclear why high 
inter-rater agreement would be expected. 
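
As a rough illustration of the agreement measure referred to above, a mean pair-wise correlation across raters can be sketched as follows. The ratings are hypothetical and the function names (`pearson`, `mean_pairwise_correlation`) are introduced here for illustration only; the cited studies may have computed the statistic differently.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_pairwise_correlation(ratings):
    """Average the correlation over every pair of raters.

    `ratings` holds one list of overall ratings per rater, all made
    on the same set of cases in the same order.
    """
    return mean(pearson(a, b) for a, b in combinations(ratings, 2))


# Hypothetical overall ratings by three raters for the same five staff members.
ratings = [
    [9, 7, 4, 6, 2],
    [8, 7, 5, 5, 3],
    [6, 9, 3, 7, 4],
]
agreement = mean_pairwise_correlation(ratings)
print(round(agreement, 2))
```

A value near 1 would indicate that raters rank and space the cases similarly; lower values indicate that raters are applying different implicit weighting systems.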

On the other hand, provision of feedback 
could be expected to assist raters in understand- 
ing the formal policy weights. If this is the case, 
then those receiving feedback would be ex- 
pected to exhibit higher inter-rater agreement 
than those not receiving such information. 

The reason for expecting higher inter-rater agreement,
given feedback, is that raters are able to
learn to use the same (or a similar) set of weights
to those that constitute the official policy
weighting system for the evaluation task. That is,
greater inter-rater agreement would be highly
correlated with a convergence between an individual’s
personal weighting system and the formal
weighting system. It is hypothesized that:


H1: The level of inter-rater agreement is higher for those
groups receiving feedback than the group receiving no
feedback.

H2: The level of conformity with the official policy
weighting system is higher for those groups receiving
feedback than the group receiving no feedback.


¹ Task properties feedback is seen as an important alternative to other types of feedback, and is often combined with other
types of feedback. For a more detailed discussion of different feedback types, see Hammond & Summers (1972) and Kessler
& Ashton (1981).


As well as studying inter-rater agreement, previous
studies have considered the level of self insight
displayed by individual raters. Self insight
refers to the degree to which an individual is
able to describe the weighting system actually 
being used. Findings with respect to self insight 
by members of Chartered Accounting firms in 
performance evaluation studies have been 
mixed, ranging from low (A. Wright, 1982) to a
“fair amount of insight” (Jiambalvo et al., 1983). 
In general, however, findings have been consis- 
tent with those reported in the psychology liter- 
ature in that individuals tend to overestimate the 
importance of minor dimensions whilst under- 
estimating the importance of major dimensions 
(Slovic & Lichtenstein, 1971; Zedeck & Kafry, 
1977).

Although there has been substantial investiga- 
tion into the level of self insight shown by raters, 
there is little evidence about the factors which 
may affect it. It is argued here that task proper- 
ties feedback may improve the level of insight. 
While, as hypothesized above, both types of 
feedback may be expected to bring the individual’s
actual weighting system closer to the official
weighting system, it does not necessarily
follow that the individual will perceive this to be
the case. This is especially so for outcome feedback.
As outcome feedback indicates the correctness
of an evaluation, the rater may act as if
he/she understands the appropriate ranking of
the dimensions but still subjectively overestimate
or underestimate the importance of various
dimensions. In contrast, where task properties
feedback is provided, this involves the direct
communication of the relative weights of the dimensions
used in the overall evaluation. To the
extent that such raters are in agreement with the 
official policy weighting system (see H2 above), 
reporting that they use this system would be ex-
pected to result in a high level of self insight. It is
hypothesized that: 


H3: The level of self insight is higher for those groups re- 
ceiving task properties feedback than those groups that 
do not receive task properties feedback.


RESEARCH METHOD 


Subjects 

Participants in the experiment were 
employees from the Sydney office of one of the 
Big 8 firms of Chartered Accountants. All 48 sub- 
jects were engaged in audit work and held one of 
the following positions — audit supervisor (n = 
17, 35%), senior accountant (n = 13, 27%), or
senior assistant (n = 18, 38%). In this firm, audit
supervisors have at least five years’ audit experience,
senior accountants at least three years and
senior assistants are graduates with a minimum
of one year. Part of their duties involved the preparation
of performance evaluations of subordinates
and all subjects had participated in staff
training related to such evaluations. The experiment
involved four treatment conditions (see
below) and subjects were randomly assigned to 
groups with 12 subjects per group. 


Experimental task 
The performance evaluation task used in the
experiment required subjects to make an overall
rating (on a 1–11 scale) for each of 32 hypothetical
audit assistants based on their performance
on five cues or dimensions [technical skills (Cue 
1), motivation level (Cue 2), communication 
skills (Cue 3), time control (Cue 4) and analyti- 
cal skills (Cue 5)]. Performance on each cue had 
been previously rated on a two-level basis — 
satisfactory (S) or unsatisfactory (U) — for each 
case. Thus, the 32 cases represented a full facto-
rial design and ensured that cue intercorrela-
tions were equal to zero.
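
The design property described above can be checked with a short sketch. Five two-level cues give 2^5 = 32 distinct cases, and in a full factorial every pair of cue columns is uncorrelated. The 0/1 coding of unsatisfactory/satisfactory and the helper name `correlation` are assumptions made for illustration; the experimental materials themselves are not reproduced here.

```python
from itertools import product

CUES = [
    "technical skills",
    "motivation level",
    "communication skills",
    "time control",
    "analytical skills",
]

# Every combination of unsatisfactory (0) and satisfactory (1) across the five cues.
cases = list(product([0, 1], repeat=len(CUES)))


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Transpose so that each entry holds the 32 levels of one cue.
columns = list(zip(*cases))

# Largest absolute correlation over all pairs of distinct cues.
max_abs_r = max(
    abs(correlation(columns[i], columns[j]))
    for i in range(len(CUES))
    for j in range(i + 1, len(CUES))
)

print(len(cases), max_abs_r)
```

Because the cues are orthogonal, any weight a rater places on one cue can be estimated without being confounded by the others.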
Official policy statement weights 

In order to provide feedback, a set of “official” 
policy weights was required. For outcome feed- 
back “correct” evaluations were based on these 
weights and for task properties feedback the 
weights themselves were communicated to sub- 
jects. The official policy weights were derived 
by first asking three partners in the firm studied 
to assign a total of 100 points among the five 
cues according to the relative importance they 
attached to each. The weightings of the three 
partners were then averaged and the resulting 
relative weights were referred to in the experimental
materials as the “official policy statement
weights”. Composite reliability (Holsti,
1969) was 0.87, indicating a high level of relia- 
bility among the three partners (Krippendorff, 
1980). 


Procedure 

Subjects were randomly assigned to one of four treatment groups,
which determined the type of feedback they received. Subjects assigned
to the control group or no feedback group (NF) received no feedback.
Those in the outcome feedback group (OF) had the “correct” evaluation
based on the official policy weights indicated to them after each case
was evaluated. In the case of the task properties feedback group
(TPF), subjects were given details of the official policy weights,
while the combined feedback group (OF + TPF) received both types of
feedback as just described. In the case of OF, subjects were also
shown an example of a “correct” evaluation before commencing the cases
to ensure comparability with the TPF group, where official policy
weights were given prior to any evaluations.

Subjects were provided with a written description of the task and
details of the feedback to be given appropriate to the treatment
condition, as well as a booklet of the 32 cases. After making the
evaluations, subjects completed a post-test questionnaire which asked
them to assign 100 points to the five cues according to the relative
importance they felt they attached to the cues during the evaluations.
These weights are referred to below as the subjective weights.²


Variable measurement 

Three variables were used in the analysis: inter-rater agreement,
conformity with the official policy weighting system and self insight.
In order to measure inter-rater agreement, pair-wise correlations of
the 32 overall evaluations for all possible pairs of subjects in each
treatment group were computed. For each group this meant the
calculation of 66 correlation coefficients (possible combinations of
pairs = n!/[(n - r)! r!], where n = number of subjects per group and
r = number of subjects per combination; here 12!/(10! 2!) = 66).³
The mean of these 66 pair-wise correlations provides a measure of
average inter-rater agreement for each treatment group (see Jiambalvo
et al., 1983).
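The agreement measure described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the function names and the three-rater toy data are assumptions introduced here, and the correlation is the ordinary product-moment coefficient.

```python
from itertools import combinations
from math import comb, sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_agreement(ratings_by_subject):
    """Average correlation over every unordered pair of subjects in a group."""
    pairs = list(combinations(ratings_by_subject, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Number of distinct pairs in a group of 12 subjects: 12! / (10! 2!)
pairs_per_group = comb(12, 2)

# Hypothetical ratings from three raters over four cases (not study data).
toy_group = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
toy_agreement = mean_pairwise_agreement(toy_group)
```

A rater who gives every case the same rating has zero variance, so the correlation is undefined and the sketch would raise a division error; real data would need that case handled explicitly.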

To measure conformity with the official pol- 
icy weighting system and the level of self insight 
it was first necessary to establish the weighting 
system used by each subject. This involved the 
computation of a multiple regression equation 
for each subject. In formulating each equation 
the dependent variable was the overall evalua- 
tion made by the subject (an integer value from 
1 to 11) and the independent variables were the 
scores (S = 1 or U = 0) pre-assigned to the
hypothetical audit assistants on the five cues. 
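As a hedged illustration of this policy-capturing step, the sketch below builds the 32 cue profiles and recovers one subject's regression weights. It relies on a property of a balanced two-level full factorial: because the cue columns are mutually uncorrelated, each least-squares slope equals the difference between the mean rating when the cue is S and when it is U. The names `CASES` and `fit_policy` and the hypothetical subject are assumptions made for the example.

```python
from itertools import product

# All 2**5 = 32 cue profiles; 1 = satisfactory (S), 0 = unsatisfactory (U).
CASES = list(product((0, 1), repeat=5))


def fit_policy(ratings):
    """Least-squares intercept and cue slopes for one subject's 32 ratings.

    In a balanced two-level full factorial the cue columns are mutually
    uncorrelated, so each slope equals mean(rating | S) - mean(rating | U).
    """
    slopes = []
    for cue in range(5):
        rated_s = [y for case, y in zip(CASES, ratings) if case[cue] == 1]
        rated_u = [y for case, y in zip(CASES, ratings) if case[cue] == 0]
        slopes.append(sum(rated_s) / len(rated_s) - sum(rated_u) / len(rated_u))
    # Every cue has mean 0.5 across the 32 cases.
    intercept = sum(ratings) / len(ratings) - 0.5 * sum(slopes)
    return intercept, slopes


# Hypothetical subject who rates 1 + 4*c1 + 2*c2 + 2*c3 + c4 + c5 (range 1..11).
true_slopes = (4, 2, 2, 1, 1)
toy_ratings = [1 + sum(b * x for b, x in zip(true_slopes, case)) for case in CASES]
intercept, slopes = fit_policy(toy_ratings)
```

With correlated cues this shortcut no longer holds and a general multiple regression routine would be needed.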

2 Copies of the research instruments are available upon request from either author, School of Accounting, University of New
South Wales, P.O. Box 1, Kensington, 2033, Australia.

3 Although inter-rater agreement is usually examined using correlational analysis, it is subject to a potential validity threat as
the level of inter-rater agreement can be inflated by disregarding any between-judge mean difference (see Bartko &
Carpenter, 1976; W. F. Wright, 1982). This problem is of particular concern when focussing on the absolute level of inter-
rater agreement. Here, however, concern is with the relative levels of inter-rater agreement across treatment groups.

Next, in order to facilitate a comparison of the weights actually
accorded to each cue by a given subject with the subjective weights
reported and with the set of official policy weights, relative weights
(Hoffman, 1960) were computed as follows:

    RW_{is} = \beta_{is} r_{is} / R_s^2

where RW_{is}    = relative weight of ith cue for subject s
      \beta_{is} = beta weight of ith cue for subject s
      r_{is}     = validity coefficient for ith cue for subject s
      R_s^2      = squared multiple correlation coefficient for subject s.


The sum of the relative weights over the five cues for each subject
equals 1; the weights were multiplied by 100 for direct comparison
with subjective and official policy weights. RW_{is} indicates the
contribution of each cue as a proportion of the predictable linear
variance. In this study, as the set of cues was orthogonal,
\beta_{is} = r_{is} (see Hoffman, 1960).⁴
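A minimal sketch of the relative weight calculation follows, assuming the orthogonal 32-case design described earlier. Since the beta weight equals the validity coefficient and R² is the sum of the squared validity coefficients under orthogonality, each relative weight reduces to r_i² divided by the sum of the r_j². The helper names and the hypothetical subject are illustrative assumptions, not taken from the study.

```python
from itertools import product
from math import sqrt

# The 32 cue profiles of the full factorial (1 = S, 0 = U).
CASES = list(product((0, 1), repeat=5))


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def relative_weights(ratings):
    """Relative weights scaled to sum to 100.

    With orthogonal cues beta_i = r_i and R^2 = sum(r_j^2), so
    RW_i = beta_i * r_i / R^2 reduces to r_i^2 / sum(r_j^2).
    """
    validities = [pearson([case[i] for case in CASES], ratings) for i in range(5)]
    r_squared = sum(r * r for r in validities)
    return [100 * r * r / r_squared for r in validities]


# Hypothetical subject (not study data): rating = 1 + 4*c1 + 2*c2 + 2*c3 + c4 + c5.
toy_ratings = [1 + 4 * c1 + 2 * c2 + 2 * c3 + c4 + c5 for c1, c2, c3, c4, c5 in CASES]
weights = relative_weights(toy_ratings)
```

For this hypothetical subject the squared slopes are 16, 4, 4, 1 and 1, so the first cue carries 16/26 of the predictable variance.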

In order to assess the conformity of the subjects’ weighting systems
with the official policy weighting system, two “discrepancy” measures
were computed (see Summers et al., 1970). The first, mean absolute
error (MAE), is the sum of the absolute differences between a
subject’s objective weights and the official policy weights, averaged
over the five cues. The second, the square root of the mean square
error (√MSE), is the square root of the average of the squared
differences between a subject’s objective weights and the official
policy weights. Self insight was measured in a similar way: both MAE
and √MSE were computed using the objective weights and the subjective
weights of subjects. It should be noted that the larger (smaller) the
MAE or √MSE, the lower (higher) the conformity with the official
weighting system or the level of self insight.
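The two discrepancy measures can be illustrated as follows. This is a sketch with assumed function names. For concreteness it compares the NF group's mean cue weights with the official policy weights as reported in Table 4; because the study averaged subject-level discrepancies, the values produced here need not match the group means in Table 1.

```python
from math import sqrt


def mean_absolute_error(weights_a, weights_b):
    """Average absolute difference between two sets of cue weights."""
    return sum(abs(a - b) for a, b in zip(weights_a, weights_b)) / len(weights_a)


def root_mean_square_error(weights_a, weights_b):
    """Square root of the average squared difference between cue weights."""
    squared = sum((a - b) ** 2 for a, b in zip(weights_a, weights_b))
    return sqrt(squared / len(weights_a))


official = [17, 17, 24, 10, 32]       # official policy weights (Table 4)
nf_group_mean = [30, 22, 18, 15, 15]  # NF group mean objective weights (Table 4)

mae = mean_absolute_error(nf_group_mean, official)
rmse = root_mean_square_error(nf_group_mean, official)
```

The same two functions give the self insight measures when the second argument is a subject's subjective weights instead of the official policy weights.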


Data analysis
To test for differences among the four feedback treatments, one-way
ANOVAs were performed, one for each of the five measures described
above. Where significant differences were found, individual
comparisons between groups were made using Tukey’s “honestly
significant difference” technique (Hays, 1961). As the measures of
inter-rater agreement were correlation coefficients, they were
transformed to Fisher Z scores prior to conducting the ANOVA
(see Hays, 1961).
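The transformation and test described above can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the correlations are made-up values, and only the F statistic and its degrees of freedom are computed (no p-value and no Tukey comparisons).

```python
from math import atanh


def fisher_z(r):
    """Fisher's r-to-z transformation, z = atanh(r)."""
    return atanh(r)


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical pairwise correlations for three small groups (not study data).
toy_correlations = [[0.70, 0.75, 0.80], [0.85, 0.88, 0.90], [0.90, 0.92, 0.95]]
toy_z = [[fisher_z(r) for r in group] for group in toy_correlations]
f_stat, df_between, df_within = one_way_anova_f(toy_z)
```

With four groups of 66 transformed correlations each, the same function yields 3 and 260 degrees of freedom, which agrees with the d.f. of 260 reported for the inter-rater agreement comparisons.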


RESULTS 


Inter-rater agreement 
The results of a one-way ANOVA performed to investigate differences
among feedback treatments support H1 (F3,260 = 31.73, p < 0.001).
The mean pair-wise correlations for the four treatment groups are
0.78 (NF), 0.87 (OF), 0.89 (TPF) and 0.92 (OF + TPF). Individual
comparisons using Tukey’s “honestly significant difference” test
(Hays, 1961) show differences between the NF group and each of the
feedback groups but no significant differences between the three
feedback groups (q4 = 3.69, d.f. = 260, mse = 0.088, p < 0.05). In
short, the results show that, irrespective of type, feedback promotes
inter-rater agreement.


Agreement with official policy weighting system
The results also support H2. A one-way ANOVA shows significant
differences among the treatments on both measures of conformity with
the official policy weighting system (MAE: F3,44 = 8.33, p = 0.001
and √MSE: F3,44 = 10.05, p = 0.001). Group means for these two
measures are shown in Table 1. Recall that the means indicate the
amount by which each objective cue weight departed, on the average,
from the weights suggested by the official policy for each group, with
larger mean values indicating a greater degree of departure.
Individual comparisons computed using Tukey’s “honestly significant
difference” technique (Hays, 1961) reveal differences between the NF
group and each of the feedback groups, but no differences between the
three feedback groups (MAE: q4 = 3.79, d.f. = 44, mse = 16.72,
p < 0.05 and √MSE: q4 = 3.79, d.f. = 44, mse = 43.42, p < 0.05).
Again, the results show that, irrespective of type, feedback promotes
conformity with the official policy weighting system.

4 Since the set of cues was orthogonal, the zero-order validity coefficients between each cue and the overall evaluation could
be used as indicators of the relative importance of the cues (see Zedeck & Kafry, 1977, p. 278). Relative weights were used,
however, as they are directly comparable with the subjective weights and official policy weights and are more intuitively
comprehensive.


TABLE 1. Group means for measures of agreement with official weighting system

          NF       OF       TPF      OF + TPF
MAE       13.10    8.01     6.34     5.49
√MSE      15.83    9.35     7.61     6.16


Self insight


The results do not support H3. Self insight is not significantly
different across groups (MAE: F3,44 = 0.344, p > 0.05 and √MSE:
F3,44 = 0.287, p > 0.05). Cell means for each measure of self insight
are shown in Table 2. They indicate the amount by which each objective
cue weight departed, on the average, from the subjective weights for
each group. As before, the larger the mean value, the greater the
degree of departure.


TABLE 2. Group means for measures of self insight

          NF       OF       TPF      OF + TPF
MAE       6.88     6.01     6.08     6.54
√MSE      8.18     7.19     7.48     7.63


DISCUSSION


The first important finding in this study is that the level of
inter-rater agreement was higher for those groups receiving feedback
about the official policy weighting system. This was the case
irrespective of the manner in which the feedback was given. The
difference between these groups and the NF group appears to be due to
the greater variety of evaluation models used by the NF group. That
this was the case can be ascertained by examining the objective
weights used by subjects in each group. Table 3 summarizes information
about objective weights in terms of the frequency with which the five
cues were given highest and lowest weightings by subjects on a group
basis. First, from Table 3, it can be seen that there was a greater
variety of weighting systems in the NF group than the three feedback
groups, since the frequency with which cues were given highest and
lowest weightings was spread more evenly over the five cues. Second,
note that the cue weighted highest and the one weighted lowest for the
three feedback groups were the same (Cue 5 and Cue 4, respectively).
This indicates not only greater agreement about relative cue weights
within each feedback group but also agreement between groups. Finally,
the combination of Cue 5 (highest) and Cue 4 (lowest) occurred most
frequently: seven times (OF), nine times (TPF) and ten times
(OF + TPF).

It should, however, be noted that, following Jiambalvo et al. (1983),
the lower level of inter-rater agreement in the NF group could be
attributable to subjects being less consistent when applying their
evaluation policies. Further analysis of the data, however, showed
that consistency was constant across the groups. The multiple
correlation coefficient, R, indicates the consistency with which the
subject has apparently applied a linear evaluation model when
assessing the 32 cases.⁵ Multiple correlation coefficients were
computed for each subject. The mean multiple correlation coefficients
for the groups were 0.94 (NF), 0.97 (OF), 0.95 (TPF) and 0.96
(OF + TPF). Importantly, a one-way ANOVA indicated no significant
differences among the four groups (F3,44 = 1.83, p > 0.05). The lower
level of inter-rater agreement in the NF group cannot be attributed to
subjects being less consistent in applying their evaluation models.


TABLE 3. Frequency of cues given highest and lowest weights by group*

                                Criterion
             Highest weight (cue)       Lowest weight (cue)
Group         1    2    3    4    5      1    2    3    4    5
NF            7    5    2    1    0      2    4    2    2    2
OF            2    1    1    0    7      0    1    0    9    2
TPF           1    1    1    0    9      0    2    0   11    0
TPF + OF      0    0    2    0   10      0    0    0   12    0

* Where total exceeds 12 (cell size), more than one cue had the highest or lowest weight.

The second important finding is that the increase in inter-rater
agreement can be attributed to an understanding of the official policy
weighting system. That is, there was a greater level of conformity
with the weighting system for those subjects receiving feedback than
for those in the NF group. A comparison of mean actual cue weights for
each group with official policy weights provided further support for
this claim (see Table 4). In particular, the NF group rated Cue 5
(analytical skills), which was the most important cue in the official
policy weighting system, as least important. In contrast, the feedback
groups rated this cue as most important (OF, equally most important)
in conformity with the official policy weighting system. It should be
noted that the ranking exhibited by the OF group was similar to that
of the official policy weighting system. This is of particular
interest since this group did not directly receive information about
the official policy weighting system. Not surprisingly, the rankings
for the two groups receiving TPF were identical [allowing for equal
rankings for Cue 1 (technical skills) and Cue 2 (motivation level) in
the official policy statement].

As H3 was not supported, it is necessary to consider why feedback,
particularly TPF, had no effect on self insight. The level of self
insight across groups was investigated further by correlating the five
subjective weights and five objective weights for each subject and
computing the mean product-moment correlation coefficient for each
group. This analysis indicates that the level of self insight was high
for all groups. The mean product-moment correlation coefficients were:
0.78 (NF); 0.76 (OF); 0.86 (TPF) and 0.80 (OF + TPF).⁶ These means,
however, should be interpreted cautiously given the small number of
observations (i.e. n = 5) used to calculate each coefficient. Given
the level of these means, it appears that ceiling effects could have
precluded feedback (particularly TPF) from affecting self insight.
That is, as self insight is relatively high for the NF group, there is
little room for improvement when feedback is provided. Future research
could investigate the impact of feedback where, in the absence of
feedback, self insight is relatively low.

TABLE 4. Mean actual cue weights by group compared with official policy weights

                                          Group
       Official policy      NF             OF             TPF            OF + TPF
Cue    Weight  Rank     Weight  Rank   Weight  Rank   Weight  Rank   Weight  Rank
1        17     3         30     1       25     1       20     3       12     4
2        17     3         22     2       18     4       13     4       14     3
3        24     2         18     3       21     3       23     2       29     2
4        10     5         15     4       11     5        5     5        6     5
5        32     1         15     4       25     1       39     1       39     1


5 In terms of the regression form of the lens model this is R_s, where R_s = r_{Y_s \hat{Y}_s} (see Libby, 1981).

6 The level of self insight exhibited by each group was quite high. Even so, a table listing both objective and subjective weights
for each cue by subject (which is available from the authors on request) shows a tendency to overestimate minor cues and
underestimate major cues. These findings are consistent with Jiambalvo et al. (1983).

As with all experimental studies, the findings here must be
interpreted in the light of certain limitations. The use of a
factorial design is subject to external validity threats. Of
particular concern here is the possibility of an unrepresentative
proportion of extreme cases which may have led to an overstatement of
self insight.⁷ As indicated above, this may have limited the effect of
feedback on self insight. Also, use of a factorial design usually
restricts the number of cues that can be used. As stated in the
introductory comments, the number of cues used by audit firms for
staff performance evaluation purposes is often greater than five. Not
only is the cue set larger in practice, but also it is likely that a
number of these cues will be correlated. Under such conditions it can
be shown that agreement using different weighting systems can be quite
high (Ashton, 1979; Einhorn & Hogarth, 1975). Finally, performance
evaluation systems often have more than two levels of assessment on
each individual cue (five levels were used by the firm studied). While
this would be expected to make the evaluation task more complex, the
impact on the current results is difficult to predict.

These limitations restrict the practical significance of the results
in this study. Even so, the results indicate that feedback is useful
in assisting the individual rater to “learn” the official policy
weighting system. The type of feedback to be used in a practical
setting would depend on a notion of cost effectiveness. Where partners
find it difficult to formulate a specific set of weights, training
sessions using outcome feedback based on the partners’ evaluations for
the same cases would seem appropriate. Of course, this would be
dependent on agreement among partners as to the appropriate overall
rankings on the cases being evaluated. On the other hand, where it is
possible to formulate the set of weights, these could be communicated
directly to raters (for example, on a 100 point scale). Finally,
training sessions could incorporate both types of feedback. The
provision of outcome feedback, after each evaluation, to raters who
have already received task properties feedback has been found to help
overall evaluation performance (see Hirst & Luckett, 1987; Harrell,
1977).

Another interesting finding from a practical
viewpoint is that feedback changed the raters’
weighting systems quite dramatically. As was 
shown in Table 4, the NF group weighted Cue 5 
(analytical skills) as the least important cue 
along with Cue 4 (time control). In contrast, all 
feedback groups regarded Cue 5 as being the 
most important cue, and this ranking was consis- 
tent with that of the official policy weighting sys- 
tem. The fact that there was this marked differ- 
ence between the NF group and the feedback 
groups indicates the power of feedback to influ- 
ence the preferences of raters, and supports the 
use of feedback as an important intervention 
capable of changing the beliefs of individuals. 

In conclusion, the findings of this study 
suggest a number of avenues for further re- 
search. For example, while the results here indi- 
cate that various types of feedback are equally ef- 
fective over the set of 32 cases, the relative effec- 
tiveness of different feedback types over time 
could be investigated. Research by Hirst & Luc- 
kett (1987) shows that the impact of TPF on task 
learning is immediate, while learning takes place 
over a period of time in the case of OF. More re- 
search of this nature would be useful in identify- 
ing the most appropriate training methods 
where cognitive tasks are involved. 

Further, the importance of different weighting systems on overall
evaluations in actual performance evaluation settings could be
studied. Some research (for example, Ashton, 1979; Einhorn & Hogarth,
1975) has shown that overall evaluations become less sensitive to
different weighting systems with increasing numbers of cues, higher
levels of mean cue intercorrelation and lower variability of cue
weights, relative to their mean. By using actual cases rather than a
factorial design, the sensitivity of various weighting systems in
practice could be gauged. The appropriateness of simple unit weighting
systems could be investigated as such systems may be less costly to
develop and operate than more complex systems (Ashton, 1979, p. 174).
In addition, the relative effectiveness of feedback in assisting
subjects to learn different weighting systems could be investigated.


7 This can also affect the level of inter-rater agreement. However, as noted before, concern in this study is with relative
agreement across groups.

8 While the results of this study indicate that the type of feedback does not affect the ability of raters to understand the official
policy weighting system, the group means reported in Table 1 suggest higher conformity where TPF was provided. Following
Hirst & Luckett (1987), this may be due to the immediate impact of TPF in learning the weighting system as compared to
learning over time in the case of OF. Given sufficient cases, however, both types of feedback might be equally effective.


BIBLIOGRAPHY 


Ashton, R. H., Some Implications of Parameter Sensitivity Research for Judgment Modeling in Accounting,
The Accounting Review (January 1979) pp. 170-179.

Bartko, J. J. & Carpenter, W. T., On the Methods and Theory of Reliability, Journal of Nervous and Mental
Disease (1976) pp. 307-317.

Einhorn, H. J. & Hogarth, R. M., Unit Weighting Schemes for Decision Making, Organizational Behavior
and Human Performance (April 1975) pp. 171-192.

Hammond, K. R. & Summers, D. A., Cognitive Control, Psychological Review (Vol. 79, No. 1, 1972) pp. 58-67.

Harrell, A. M., The Decision-making Behavior of Air Force Officers and the Management Control Process,
The Accounting Review (October 1977) pp. 833-841.

Hays, W. L., Statistics, 3rd Ed. (Tokyo: Holt-Saunders, 1961).

Hirst, M. K. & Luckett, P. F., Task Learning and the Effectiveness of Different Types of Feedback in
Performance Evaluation Judgments (1987), Working Paper, University of New South Wales.

Hoffman, P. J., The Paramorphic Representation of Clinical Judgment, Psychological Bulletin (March
1960) pp. 116-131.

Holsti, O. R., Content Analysis for the Social Sciences and Humanities (Reading, MA: Addison-Wesley,
1969).

Hoskin, R. E., Opportunity Cost and Behavior, Journal of Accounting Research (Spring 1983) pp. 78-95.

Jiambalvo, J., Performance Evaluation and Directed Job Effort: Model Development and Analysis in a CPA
Firm Setting, Journal of Accounting Research (Autumn 1979) pp. 436-455.

Jiambalvo, J., Measures of Accuracy and Congruence in the Performance of CPA Personnel: Replications and
Extensions, Journal of Accounting Research (Spring 1982) pp. 152-161.

Jiambalvo, J., Watson, D. J. H. & Baumler, J. V., An Examination of Performance Evaluation, Accounting,
Organizations and Society (1983) pp. 13-29.

Kessler, L. & Ashton, R. H., Feedback and Prediction Achievement in Financial Analysis, Journal of
Accounting Research (Spring 1981) pp. 146-162.

Kida, T. E., Performance Evaluation and Review Meeting Characteristics in Public Accounting Firms,
Accounting, Organizations and Society (1984) pp. 137-147.

Krippendorff, K., Content Analysis: An Introduction to Its Methodology (Beverly Hills, CA: Sage, 1980).

Libby, R., Accounting and Human Information Processing: Theory and Application (Englewood Cliffs,
NJ: Prentice-Hall, 1981).

Slovic, P. & Lichtenstein, S., Comparison of Bayesian and Regression Approaches to the Study of Information
Processing in Judgment, Organizational Behavior and Human Performance (1971) pp. 649-744.

Steinmann, D. O., The Effects of Cognitive Feedback and Task Complexity in Multiple-cue Probability
Learning, Organizational Behavior and Human Performance (1976) pp. 168-179.

Summers, D. A., Taliaferro, J. D. & Fletcher, D. J., Subjective vs Objective Description of Judgment Policy,
Psychonomic Science (1970) pp. 249-250.

Wright, A., An Investigation of the Engagement Evaluation Process for Staff Auditors, Journal of Accounting
Research (Spring 1982) pp. 227-239.

Wright, W. F., Comparison of the Lens and Subjective Probability Paradigms for Financial Research
Purposes, Accounting, Organizations and Society (1982) pp. 65-75.

Zedeck, S. & Kafry, D., Capturing Rater Policies for Processing Evaluation Data, Organizational Behavior
and Human Performance (April 1977) pp. 269-294.


Accounting, Organizations and Society, Vol. 14, No. 5/6, pp. 389-413, 1989.
Printed in Great Britain

0361-3682/89 $3.00+.00
Pergamon Press plc


THE TAXMAN COMETH: SOME OBSERVATIONS ON THE INTERRELATIONSHIP
BETWEEN ACCOUNTING AND INLAND REVENUE PRACTICE*


Abstract


Foucault, in Discipline and Punish: The Birth of the Prison (1977a), portrays the operation of society as an
exercise of disciplinary technologies. Adopting this position, the author seeks to explore the British Inland
Revenue’s use of techniques of registration, categorization and surveillance which have the effect of
subjecting organizations and their accounting practice to a compulsory visibility. The application of this
disciplinary technology is seen to influence, intentionally or otherwise, the adoption and development of
particular forms of accounting practice in the organization under study. The accounting process is seen to
be inextricably involved in the Revenue’s powers and practices, both as the focus of the Revenue’s interest
and as a facilitative technology which renders the financial transactions visible to the Revenue’s gaze.


On July 12 (1979) Judge Leonard, the Common Sergeant, 
sitting at the Central Criminal Court, issued four warrants 
under section 20C (of the Taxes Management Act, 1970) 
authorizing the search of four premises. Those premises 
were the offices of Rossminster Ltd, the home of Mr 
Ronald Anthony Plummer, the Managing Director, the 
offices of AJR Financial Services Ltd, who provided secre- 
tarial and accounting services to the Rossminster group 
of companies and the home of Mr Roy Clifford Tucker, a 
chartered accountant with a business relationship with 
the group. 

The next day at 7 a.m. officers of the Revenue accompanied
by police officers went to the homes of Mr Plummer
and Mr Tucker and to the offices of Rossminster and
AJR Financial Services to execute the warrants. They
waited at the offices until employees arrived to let them
in, but at the homes of Mr Plummer and Mr Tucker they
demanded admittance at 7 a.m. Virtually all the papers
and documents that Mr Plummer’s house contained were
seized and removed. Much the same occurred at Mr
Tucker’s house. Van loads of documents were taken from
Rossminster’s offices; the officers remained on the premises
all day. Much the same occurred at AJR’s offices.
The Revenue refused to say what offence was alleged to
have been committed or by whom (The Times, 14 December
1979, p. 12).


The British Inland Revenue¹ (more commonly referred to as the
“Revenue”), as the above passage reveals, has an interest in the
activities of commercial organizations. Furthermore, the search and
seizure of documents from the offices of AJR Financial Services, which
performed Rossminster’s accounting functions, suggests that one of the
Revenue’s specific concerns is with the financial activities and
transactions of organizations. Such a focus suggests that there may be
a relationship between the powers and practices of the Revenue and the
practice of accounting in organizations. An exploration of this
relationship is the focus of the present analysis.


* The author wishes to thank the reviewers of Accounting, Organizations and Society and the Interdisciplinary Perspectives
on Accounting Conference 1988. The author also wishes to thank Peter Miller, Richard Macve and David Cooper for their
comments and encouragement.

All names have been changed to preserve the anonymity of the persons and organizations involved.

1 The British Inland Revenue is responsible for the administration and collection of taxes, most notably income and
corporation tax. Two other institutional bodies, namely the Department of Trade and Industry, which regulates and monitors
the activities of organizations, and the Customs and Excise, which is responsible for the administration and collection of value
added tax, are also referenced in the paper. Both of these institutional bodies are part of the judicial state apparatus and are
involved in the policing of the financial transactions of organizations.

Within the accounting literature the relationship between the powers
and practices of the Inland Revenue and the practice of accounting is
largely ignored. Accounting has restricted itself to the technical
computation and recording of taxation liabilities or to the way tax
investigations are conducted (Helsby, 1986). The study of the
relationship between Revenue and accounting practice has become more
relevant since the Revenue modified its investigation techniques in
1976 (Reader, 1981, p. 5). Rather than a cursory glance at all
submitted tax returns (referred to as the “annual review”) with
periodic “critical reviews”, the Revenue has adopted a policy of
conducting investigations into a predetermined percentage of
companies. These investigations include an in-depth examination of the
companies’ accounts and “an inquiry into the records and underlying
information from which they are constructed” (Reader, 1981, p. 6). The
Revenue has therefore “moved away from being a technically oriented
body to primarily an investigative agency” (Helsby, 1986), resulting
in an “increasingly adversarial world of tax investigations” (Helsby,
1986). As evidence of the accounting profession’s concern with the
Revenue’s new position, there appears to be an increasingly active
attempt to forge a discursive relationship between these two
institutions (for example, see Accountancy, August and November 1983,
January and August 1984, May and August 1985 and August 1986). Further
evidence of the accounting professions’ concern with the powers and
practices of the Revenue, and their attempts to influence them, may be
found in CCAB² Technical Releases (for example, 246, 309 and 358). It
is suggested here that a fruitful research project would be to focus
upon the emergence of these discourses which posit an
interrelationship between the Revenue and the accounting professions.
Another approach to researching this interrelationship would be to
draw upon the work of Foucault (1977b) and examine the historical
conditions of possibility for the emergence of the taxation of
corporate income (originally introduced as a war tax (Sabine, 1983,
p. 3)) and the calculation of tax liability based upon accrual
accounting measures of income, in an attempt to reveal how the two
practices became interconnected.³


2 Consultative Committee of Accountancy Bodies.

The value of such historical case studies is that 
they represent a methodology for examining the 
emergence and development of the relationship 
between accounting and Revenue practice at 
the macro-level. However, the emphasis of the 
present analysis is upon the origins and develop- 
ment of accounting practice, as influenced by 
the Revenue, at the organizational or micro- 
level, rather than with the historical emergence 
of these practices per se. The concern of the 
study is to provide an in-depth analysis of the 
Revenue’s involvement in corporate affairs by 
examining the specific mechanisms employed 
by the Revenue, intentionally or otherwise, 
through which individual organizations come to 
adopt and develop particular forms of accounting.

The following analysis is strongly influenced 
by the work of Michel Foucault (my debt to him 
will be explained in a subsequent section). How- 
ever, rather than adopting an historical or 
archaeological methodology, the study employs 
a single organizational case to explore an encounter
with the powers and practices of the Revenue
in order to reveal their influence upon the
organization’s accounting system. While the organization
and its directors are seen as the site
for the exercise of the Revenue’s powers and 
practices, the research extends beyond the com- 
pany itself and explores the powers and prac- 
tices more generally. Legal statutes, Revenue in- 
vestigations and newspaper reports are used to 
illustrate the Revenue’s powers and practices in 


3 For studies that adopt an historical analysis, see Burchell et al. (1985), Loft (1986), Hoskins & Macve (1986), Hopwood (1987) and Miller & O'Leary (1987).
order to reveal how they operate and possibly
intersect with the accounting practice of organi- 
zations. Although differing methodologically 
from Foucault’s work, the study takes as its 
model a position adopted by Foucault (1980a), 
namely: 


For myself I prefer to utilize the writers I like. The only 
tribute to thought such as Nietzsche is precisely to use it, 
to deform it, to make it groan and protest. And if the com- 
mentators say that I am being faithful or unfaithful to 
Nietzsche, that is of absolutely no interest (Foucault, 
1980a, p. 54). 


THE CONTEXT OF THE STUDY 


The field study, upon which the present analy- 
sis is based, was conducted in a company called 
Axis Records Limited, an independent record 
producer in the British music industry with a 
turnover of £1.5 million per annum. The direc- 
tors and founders of the company, Michael 
Needham and Steve Jackson, claimed to operate 
on a different set of premises to that of the major 
record producers (the “Majors”) and indeed the
majority of commercial organizations. It was this 
avowed difference that made the organization 
attractive as a research site. The directors drew 
a sharp distinction between the “Majors” which 
they claimed emphasized profit over musical 
quality and innovation and the “Independents”, 
such as themselves, which emphasized aesthetics
over economics. They attempted to
operationalize this ideal by adopting a number of 
practices which were distinctly different to 
those employed by the Majors. For example, no 
band or artist was legally contracted to Axis. This 
stood in stark contrast to the Majors who held 
their artists to long term contracts as a means of 
protecting their “investments”. No advances 
were paid to artists to lure them to Axis, a practice
intended to ensure that artists were committed
to their music. Those primarily interested in
economic gain would be attracted by the 
“cheque waving” of the Majors. Axis did not 
employ any overt marketing strategies to prom- 
ote records, but rather relied on the artists to 
promote their own music through concert 
tours. In addition, Axis would only distribute
their records through independent record dis- 
tributors. This preserved their independence 
from the Majors who owned the major distri- 
bution companies. Moreover, the major 
distributors refused to distribute small quan- 
tities of records, which were regarded by them 
as “uneconomic” but which were necessary for 
the promotion of new and innovative music. Fi- 
nally, payments to the band were based on a 50/ 
50 split of profits on each individual record. In 
contrast, the Majors reimbursed artists on a per- 
centage of sales. The 50/50 split meant that if the 
record did sell well, the artists would have a 
much greater share of the income; if it did not, 
Axis would absorb the loss. 
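
The difference between the two payment bases can be illustrated with a minimal sketch. The figures below (receipt per record, outlays and the royalty rate) are hypothetical and are not drawn from Axis's records; the function names are likewise introduced only for illustration.

```python
# Illustrative comparison of two ways of paying a band.
# All figures are hypothetical; none are taken from Axis's books.

def profit_split_payment(units_sold, price, fixed_costs, band_share=0.5):
    """Payment under a profit split: the band takes a share of any profit,
    and the label absorbs the whole of any loss."""
    profit = units_sold * price - fixed_costs
    return max(0.0, band_share * profit)


def royalty_payment(units_sold, price, royalty_rate):
    """Payment as a fixed percentage of sales revenue."""
    return royalty_rate * units_sold * price


if __name__ == "__main__":
    price = 2.00          # assumed receipt per record, in pounds
    fixed_costs = 3000.0  # assumed outlays for the release, in pounds
    royalty_rate = 0.10   # assumed royalty rate paid by a Major
    for units in (1000, 5000, 50000):
        split = profit_split_payment(units, price, fixed_costs)
        royalty = royalty_payment(units, price, royalty_rate)
        print(f"{units:>6} sold: profit split £{split:,.2f}, royalty £{royalty:,.2f}")
```

On these assumed figures the band receives nothing from a release that fails to cover its outlays, but a considerably larger sum than a royalty would yield once sales are strong.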

In its early years Axis had a history of success- 
ful records, almost entirely attributable to one 
band. To survive, an independent record pro- 
ducer has to have at least one successful band 
whose records regularly make it into the top 40 
of the singles chart or the top 20 of the album 
chart. The average contribution margin on 
album sales for Axis was high; approximately 
77% of sales which meant that successful pro- 
ducts were highly profitable. The company was 
cash rich in its early years and this enabled the 
directors to promote innovative music and 
absorb losses from financially, although not mus- 
ically, unsuccessful records. This was in keeping 
with the ideal of promoting aesthetics over 
economics. In some senses it also permitted the 
directors the luxury of ignoring the economic 
consequences of many of their actions. As Steve 
Jackson commented: 


Almost all businesses where you buy something and sell 
it, your profit margin is maybe 20%. But in a record com- 
pany you pay 60p and you're getting back 2.20, which is 
a big profit margin. I don’t think I would be a businessman 
if I had to keep tight margins. If you have a record that 
does well then its big money and no problems. 


The directors’ adherence to these principles led 
one national newspaper to describe Axis as “a 
wilfully successful record company”. 

The operation of the company was also unusual.
Axis formed part of the overall independent
music business which was made up of a
number of individual companies, each performing
a specific function in the manufacture and
sale of records. Axis effectively subcontracted 
out the entire production and sales process to 
these companies. Axis’s bands recorded and
mixed their music at independent recording
studios. Master tapes of these recordings were 
then made into a metal disc at another company. 
This “metal” was then used to stamp out the 
vinyl albums and singles at a separate pressing 
plant. At the same time, independent graphic 
artists were commissioned to design the sleeves 
and labels which were then printed at a printing 
works. The printed sleeves would be forwarded 
to the pressing plant where they would be used 
to pack the vinyls and the completed product
would be sent to the independent distributors to 
be distributed to the retail outlets. Axis’s involvement
at each stage was that of troubleshooting
delays in order to meet the all-important
release dates and in approving the master
tapes, the sleeve design and the quality of the 
pressings. This was done in conjunction with the 
bands. In order to receive any copies of the
finished product, Axis had to instruct the distributors
to send copies to their offices. Axis itself
operated from a small office since its principal
activity was to seek out and promote musical
talent.

The working environment in Axis’s office was 
untypical of commercial organizations. It was 
often highly chaotic, yet the chaos appeared to 
be deliberately cultivated. The directors de- 
scribed their approach as creating a specific am- 
biance out of which innovative ideas and par- 
ticularly innovative music would emerge. Their 
philosophy was informed by the “situationist 
international” movement (see Knabb, 1981). 
Work was continually punctuated by visits from
artists, listening to new music, discussing performances,
deciding on release dates, agreeing
on sleeve design as well as chasing production 
runs and organizing recording sessions at one of 
the independent studios. 

Although Axis was an unusual organization in 
which to conduct research, it offered a number 
of intriguing opportunities. The emphasis on 
aesthetics over economics and the directors’ 
attempts to provide an appropriate ambiance
raised the question of what role accounting 
would play in such an organization. The incon- 
gruity of “accounting for aesthetics” presented 
an opportunity to research accounting in a con- 
text where conventional interpretations might 
not hold. The unfamiliarity of Axis’s operations 
could be used to wrench apart or question the 
familiar and taken-for-granted assumptions 
about the role of accounting in organizations. 

In contrast to the seemingly anarchic order of 
Axis’s operations, its financial accounting process
was remarkably conventional. A part-time
bookkeeper would arrive each Monday and
meticulously record the previous week’s finan- 
cial transactions in a double-entry bookkeeping 
system. The transactions were recorded in 
purchase and sales journals and posted to the 
general ledger each week. The existence of such
meticulous books of account stood in stark contrast
to the general activities of Axis’s office. The
accounting system was an island of methodical 
order in an otherwise chaotic setting. 

The incongruity of the conventional accounting
system in such an unconventional organization
presented itself as a puzzling datum
(Silverman, 1985) which prompted an explora- 
tion of the reasons for its existence. The direc- 
tors’ answers to my enquiries on this point were 
unequivocal: the double-entry bookkeeping sys- 
tem and the bookkeeper were installed because 
of the Taxman (a common colloquialism for the 
Revenue). 

In 1982 (prior to the commencement of this 
study), the directors had changed accountants. 
The new accountant recommended that he in- 
stall a new bookkeeping system and supply a 
bookkeeper from his employ, one day per week, 
to maintain the books. Both directors agreed 
that they had accepted the accountant’s recom- 
mendation “without debate”. Michael Needham 
commented: 

“He (the accountant) told us we needed to keep proper 
books. I never believed that the stuff (the original system)
that Steve did was any good, but that was his job. I'm
the A&R man.” (This stands for Artiste and Repertoire
and entails identifying new bands.)


It was Steve Jackson’s clear understanding that 
the books of account were required by the Revenue.
He justified the new system and the presence
of the bookkeeper in the following manner:


That’s what I am paying them for (the accountant and the
bookkeeper)... for them to do the books in such a way
that they would be able to relate to the Inland Revenue
and VAT according to the rules. I’ve learned the rules of
my business ... I expect to pay someone who has learned 
the rules of accountancy and would deal with them (the 
Revenue and Customs and Excise). 


Coincidentally, during the research, Axis’s major 
band was investigated by the Inland Revenue. 
The investigation revealed that the band had
failed to record accrued revenue and that Axis 
was the debtor. This event also had a significant 
impact on the accounting process at Axis. Steve 
Jackson regarded this as a particularly significant 
event in Axis: 


the big threat from the Revenue was the change in our 
major group's accounting system. It was a major event of 
Axis... financially . . . probably the single most important 
event ... the insistence of the Inland Revenue that the 
band be taxed on an accrual basis. That has caused a
whole massive series of problems. You suddenly change
the accounting system and things are in a complete shambles.


Both of these events suggested a distinctive line 
of inquiry: to explore the mechanisms the Revenue
employs, intentionally or otherwise, to influence
the bookkeeping and accounting practice
of organizations. The Revenue is seen as
being centrally involved in both the origin and
the development of the accounting process at 
Axis. This is not to suggest that it was the only 
factor. For example, the bookkeeper noted that 
exactly the same double-entry system was intro- 
duced in all of the accountant’s small and 
medium sized clients. Focusing in on this obser- 
vation might lead one to conclude that the new 
system was really to serve the interests of the ac- 
countant. While not dismissing this possibility, 
nor other possible explanations, the point remains
that the directors construed the new
system in terms of the requirements of the Taxman
and this aspect is therefore worthy of research.

The observation that the directors construed 
the introduction of double-entry in terms of the 
Taxman raises the question of how the Revenue 
interacts at the micro-level with individuals and 
organizations to influence accounting practice. 
To explore this issue, I rely heavily on the concept
of disciplinary power as developed in the
work of Michel Foucault. 


JURIDICO-DISCURSIVE POWER OR
DISCIPLINARY PRACTICE 


Foucault (1977a) makes a distinction between
sovereign power and disciplinary power.⁴
Sovereign power is identified with the capacity 
(literally or metaphorically) to lay down the law 
and hence with persons, institutional bodies or
forces who possess this capacity; historically the 
monarch or the judicial state apparatus in con- 
temporary society. The practice of sovereign 
power is based upon rules legislating what is for- 
bidden and upon the punishment of those who 
transgress them. The law of the land impinges 
upon those who defy it. Sovereign power is 
therefore negative and repressive.

Foucault defines the conception of sovereign 
power as juridico-discursive and suggests that
the language and imagery of the law dominate
representations of power and the established 
order, even though they have been largely 
eclipsed by another and more contemporary 
form of power which he refers to as disciplinary 
power. 

The study of disciplinary power is not con- 
cerned with centralized or legislative forms of 
power, but rather with the application of tech- 
niques or specific practices, aimed at rendering 
the activities of individuals and populations gov- 
ernable (Miller, 1987). Central to these prac- 


4 Sovereign power has received considerable attention and so will only be briefly mentioned here (see Dreyfus & Rabinow,
1982; Minson, 1985; Smart, 1985; Miller, 1987 and within an accounting context, Hoskins & Macve, 1986; Miller & O'Leary,
1987).
tices is surveillance which Foucault refers to as
“a mechanism that coerces through the play of 
the gaze” (Foucault, 1977a). However, it is a par- 
ticular form of surveillance that Foucault is 
alluding to, hence the use of the word “gaze”. 
The aim of the gaze is “so to arrange things that 
the surveillance is permanent in its effects, even 
if it is discontinuous in its action, that the perfec- 
tion of power should tend to render its actual 
exercise unnecessary”. In short, those who are 
the subject of disciplinary power are “caught up 
in a power situation which they themselves are 
the bearers” (Foucault, 1977a). 

The ideal representation of disciplinary 
power and the one Foucault uses as an analogy, 
is Bentham’s Panopticon (Foucault, 1977a). In 
the panopticon we can see an architectural expression
of a mechanism for the observation,
examination and regulation of people’s lives.
The panopticon is a series of rooms or cells constructed
in a circle such that they face a central
observation tower. The cells, illuminated from 
the rear, silhouette the occupants and render 
their actions visible to the observation tower at 
all times. Although the panopticon is only an 
analogy, it reveals the two central elements of 
disciplinary power: compulsory visibility and 
continuous and anonymous surveillance. 

Mechanisms of surveillance and visibility, 
which include techniques of registration, obser- 
vation and investigation are not confined to the 
inmates of prisons and other total institutions of 
which the panopticon is an ideal architectural 
form. On the contrary, Foucault’s contention is
that we are all caught up in a disciplinary system, 
although less obvious in its workings than in the 
prison, which has the effect of controlling our 
behavior without our knowing it. Further, sur- 
veillance is not restricted to architectural con- 
structions but may manifest itself in other forms, 
as I intend to demonstrate within the context of 
the powers and practices of the Revenue. Finally, 
surveillance, within the disciplinary system, is 
not necessarily physically intrusive, indeed its 
principal effects are achieved “through its invisibility”,
yet at the same time it “imposes on those
whom it subjects a principle of compulsory visi- 
bility” (Foucault, 1977a). 
Although the preceding paragraph constitutes
a very brief account of a highly complex process,
it nevertheless provides a framework for posing 
the following question. May the operations of 
the Revenue, both in general and in the manner 
in which they impacted upon the decision to in- 
troduce double-entry books of account in Axis, 
be characterized by juridico-discursive power
or disciplinary practice or possibly a combina- 
tion of both? 


The Revenue’s powers and practices 

The Revenue has a considerable array of legis- 
lative powers, including a requirement for all 
companies to submit annual returns and, with 
judicial approval, the right to search and seize 
documents from business premises (see Taxes 
Management Act, 1970 and various Finance 
Acts, notably the 1972 and 1976 Acts). The 
Revenue may therefore be interpreted as part of 
the judicial state apparatus and characterized by 
its legally sanctioned powers to enforce the law 
of the land and punish (or at least bring before 
the courts to punish) those who transgress it. 
Given these powers, one might expect that the 
manner in which the Revenue would influence 
bookkeeping and accounting practice would be 
through specific legislation. Yet, nowhere in the 
Revenue’s legislation is there any requirement 
to record financial transactions in a double-entry 
format. Nor does the Inland Revenue rely upon 
other statutory bodies to legislate to this effect. 
Neither the Department of Trade and Industry, 
nor the Customs and Excise require a double- 
entry bookkeeping format. The most detailed re- 
ference to bookkeeping requirements is to be 
found in the Companies Act of 1985 where it is 
stipulated that a company must keep “proper records”
sufficient to exhibit and explain the company’s
transactions, to disclose its current financial
position and to enable its directors to prepare
annual profit and loss accounts and balance
sheets giving a true and fair view of the com- 
pany’s state of affairs (Companies Act, 1985, s. 
221175). 

Given the absence of specific legislation con- 
cerning double-entry, one might speculate that 
other mechanisms, less visible and obvious than 
statutory requirements, impacted upon the directors’
decision to introduce the new system in
Axis. Indeed, the Revenue’s activities are not re- 
stricted to legally sanctioned powers but in- 
clude the application of specific practices, some 
of which are made public in “Inland Revenue 
Statements of Practice”. These practices and 
others which are not made public, may be inter- 
preted as part of a disciplinary technology based 
upon the principles of visibility and surveillance. 
This is not to discount the Revenue’s legislative 
powers, but to suggest that they either consti- 
tute part of the disciplinary technology or are 
combined with disciplinary techniques to form 
what is referred to officially as the “Revenue’s 
Powers and Practices”. 

The absence of specific legislation on book- 
keeping requirements should not be taken as 
evidence of the Revenue’s disinterest in such 
matters. On the contrary, the Revenue de- 
monstrates a considerable interest in the book- 
keeping and accounting practice of organiza- 
tions. The following section explores this in- 
terest; in short, it examines the focus of the 
Revenue’s gaze. 


THE REVENUE’S GAZE 


It is hardly surprising that the Revenue has an 
interest in the financial transactions of commer- 
cial organizations, given that its purpose is to ad- 
minister and collect taxes as computed from the 
financial transactions of organizations. The num- 
ber of investigations conducted by the Revenue, 
including the celebrated Rossminster case (see 
Tutt, 1985), is evidence of the Revenue’s in- 
terest in and its powers to seize and investigate 
books of account (Taxes Management Act, 1970, 
s. 20') if there is a suspicion that the financial 
transactions recorded within are subject to 
error or deliberate fraud. 

The Revenue in its role of administrator and 
collector of taxes, requires each commercial
organization to submit annual returns which include
an analysis of revenues earned and expenses
incurred and which are normally derived
from and supported by audited annual accounts 
(Finance Act, 1972, s. 14, para. 1'). The preparation
and submission of returns and accounts may
be seen as a facilitative technology which brings 
the financial conduct of organizations within the 
Revenue’s gaze (Foucault, 1977a). The accounts 
and returns render visible (Hopwood, 1987b) 
the financial performance and position of a com- 
pany, as well as the extent of its taxation liability 
for scrutiny by officers of the Revenue. 

The content of annual accounts and returns 
suggests that the Revenue’s gaze is focused upon 
broad aggregates of revenues and expenses and 
assets and liabilities. However, accounts and re- 
turns may be seen as merely representations, in 
a summarized and aggregated form, of the indi- 
vidual financial transactions recorded within a 
company’s books of account. Given that ac- 
counts and returns are representations, there is 
the possibility that they may also be misrep- 
resentations. It is the misrepresentation of finan- 
cial transactions, particularly those resulting in 
an understatement of taxable income, that is the 
object of the Revenue’s interest; it is on the 
space between representation and misrepresen- 
tation that the Revenue’s gaze is focused. 

The continuing and increasing evidence of 
misrepresentation, referred to as tax evasion, 
constitutes the justification for the creation and 
use of the Revenue’s enforcement powers and 
practices. For example, the Public Accounts 
Committee (1981) noted that: 


the proportion of cases examined in which understate- 
ments of taxable profits have been detected has con- 
tinued to increase steadily in recent years; from 73 per 
cent in 1977 to 85 per cent in 1980 (quote taken from 
Dodd, 1983). 


This knowledge, inter alia, was used in the Keith 
Commission (1983) to justify the enhancement 
of the Revenue’s powers (although at the same 
time arguing for increased clarification). After 
the extension of their powers a new set of mea- 
sures emerged to legitimize them. For example: 


The Treasury reported that since 1979, the number of 
staff investigating tax evasion had risen steadily, as had 
their productivity. In 1979, 1650 people were employed 
on investigative work, raising an additional £100m in ad- 
ditional taxes or £60,000 per head. By 1983 the staff had 
increased to 2495 resulting in £344m or £138,000 per
head of extra taxes (The Times, 3 April 1984). 
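
The per-head figures in the report quoted above can be checked against the staff numbers and totals it gives. The following is a small sketch of that arithmetic; the function name is introduced here for illustration only, and the inputs are the figures as reported by The Times.

```python
# Recompute the yield per investigator from the totals quoted above.
# Inputs are the reported figures; the function name is illustrative only.

def yield_per_head(additional_tax_pounds, staff):
    """Additional tax raised divided by the number of investigative staff."""
    return additional_tax_pounds / staff


reported = {
    1979: (100_000_000, 1650),  # £100m raised by 1650 staff
    1983: (344_000_000, 2495),  # £344m raised by 2495 staff
}

for year, (tax, staff) in reported.items():
    print(year, f"£{yield_per_head(tax, staff):,.0f} per head")
```

The results are close to the rounded £60,000 and £138,000 per head cited in the report.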


Misrepresentation may occur through error or 
deliberate fraud, that is, through the omission or 
falsification of transactions or through a differ- 
ence of interpretation. Differences of interpreta- 
tion may occur over such accounting concepts 
as materiality, conservatism and the timing of 
revenues and expenses. The accounting treatment
of particular transactions may therefore
result in multiple interpretations of the com- 
pany’s financial position and performance. As 
Tricker (1975) (quoting Gambling, 1971) 
suggests: 


Although most accountants in practice would claim objectivity
and that they report the facts, the truth is that the
form and content of accounting reports represent a series
of value statements. 


Misrepresentation may be confirmed or dispel- 
led by the Revenue’s examination of a com- 
pany’s “underlying records”. Many of the Re- 
venue’s enforcement powers (Taxes Manage- 
ment Act, 1970) are designed to facilitate the 
process of examination. Therefore the annual re- 
turns and accounts render visible a company’s 
books of account; it is the underlying records 
and the manner in which they are maintained 
that will be the principal object of any investigation.
(The nature of Revenue investigations will
be examined in a subsequent section.) Books of 
account are thus repositories of financial transactions,
which may be interrogated to determine
the validity (representativeness) of annual
returns and accounts. 

In this sense, the Revenue’s gaze dissects the 
annual accounts and penetrates down to the 
basic elements: the financial transactions and the 
books of account in which they are recorded. 
Although the specific interrogation of a com- 
pany’s underlying records may never occur, its 
possibility has the effect of elevating the impor- 
tance of the content of the books of account and 
the form in which they are maintained. As the 
Chairman of the Inland Revenue commented: 


It must follow that accounts built up by qualified 
accountants* from complete and reliable taxpayer's records
will usually show a result which does not arouse
the inspector's curiosity and will accordingly be less 
likely to be chosen for investigation, possibly for many 
years (emphasis added, quote taken from Dodd, 1983). 


The above quote raises the question of whether 
the original bookkeeping system in Axis would 
satisfy the Revenue’s definition of complete and 
reliable taxpayer’s records. This question may 
never be fully answered, for as I have noted, the 
Revenue does not explicitly stipulate bookkeeping
requirements.


Before double-entry 

Prior to double-entry, Steve Jackson who was 
in charge of finances, kept minimal records. Pay- 
ments, which were limited to eight major 


* The Revenue's gaze is also focused on the activities of the accountant involved in the preparation and auditing of the annual
accounts and returns. The initial focus of any Revenue investigation is concerned to “establish the extent and value of the
records available and what the accountant did to convert these into balanced accounts” (CCAB, Technical Release 246,
para. 8, quoted from Reader, 1981, p. 38, emphasis added). The accountant’s activities are therefore expressly considered and
rendered visible along with the accounts and bookkeeping records. The accountant becomes accountable to the Revenue for
his/her involvement in the accounting process of organizations. Moreover, the focus of the Revenue’s gaze encompasses the
auditing process. Auditing in its capacity of establishing the “truth and fairness” of company accounts provides them with
added legitimacy. The Revenue respects the legitimizing process involved in auditing to the extent that Beckett & Sabine
(1987) note that when investigating a company’s accounts, the inspector

will look for the full requirements of the Companies Act, 1981, ss. 1–4, 13–16 and sch. 1 to have been complied with as
well as the CCAB’s Auditing Standards and Guidelines (p.14).

In effect the Revenue employs the auditing process as a means of verifying the legitimacy of particular accounts and returns.
In this respect the auditing process, with or without the connivance of the accountant or accounting profession, becomes
part of the Revenue's disciplinary technology. (More research is required in this area.)
suppliers, were listed chronologically and coded
according to the albums or singles to which they
related. Records were made at the time of payment.
The unpaid invoices, filed chronologically,
constituted the directors’ list of creditors.
Employees’ wages were prepared by the accoun- 
tant and directly debited from Axis’s bank ac- 
count. The two distributors which Axis used 
provided a detailed monthly breakdown of sales 
by individual product and a record of stock
levels at the end of the month. These statements
were filed and used by Axis as the sole record of
sales and stock.

The major use of accounting information was 
to work out, at six monthly intervals, the payments
owing to the bands according to the 50/50
split of profits. No regular product costings
were maintained and as a consequence, Steve 
Jackson would reluctantly calculate what he re- 
ferred to as the “band accounts”. (A similar, 
although less complex exercise was performed 
each quarter to calculate VAT returns.) The sales
revenue was drawn from the distributors’ sales 
statements and costs taken from the chronologi- 
cal listings. General administrative costs were 
not allocated to products; Axis absorbed these. 
The band accounts were essentially cash flow 
calculations; no attempt was made to match 
revenue with expenses. At this time the direc- 
tors had no understanding of accrual account- 
ing. 

For the directors and the other members of 
the office, the original system provided much of 
the information they required. Steve Jackson 
described the record business as being very sim- 
ple. He commented: 


I always see the business as being y'know buying a piece 
of paper (the sleeve) off someone for 20p and buying a 
piece of plastic (the vinyl) off someone else for 30p add 
10p of recording time and getting something that costs 
60p and selling it for about 2 quid. Its absolutely as simple
as that.


The cost structure at Axis was relatively simple. 
The major cost categories were general ad- 
ministration, recording and mixing, making the 
metal, sleeve and label design, printing, pressing 
and packing. As records were ordered in 
batches, all of these costs were essentially fixed.
All costs except for general administration, were 
referred to as “outlays”. Break-even was expres- 
sed in terms of how many records were needed 
to be sold to cover outlays and this figure was 
fairly constant. If sales of a product did not reach 
break-even level, Steve Jackson would not trou- 
ble to calculate the extent of the loss. However, 
losses were not completely ignored as evi- 
denced by the following interaction: 


Michael Needham commented. “I don’t give a damn
about money, its the cultural happening that counts.” In 
reply, Steve Jackson said. “No! We're interested in loss 
minimization, not profit. We’re interested in break-even, 
well no... in fact we're interested in making lots of 
money to use it elsewhere... but overall just breaking 
even.” 
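
The break-even figure described above, the number of records that must be sold to cover outlays, can be sketched as a simple calculation. Because the outlays were treated as fixed for a batch, each record sold is assumed here to contribute its full receipt towards them. The amounts used are hypothetical and the function name is introduced for illustration only.

```python
# Break-even expressed as a number of records, as described in the text.
# Amounts are in pence to keep the arithmetic in whole numbers.
# The outlays and receipt per record below are assumed, not Axis's figures.

def break_even_units(outlays_pence, receipt_per_record_pence):
    """Smallest number of records whose receipts cover the fixed outlays."""
    # Ceiling division: a part-record cannot be sold.
    return -(-outlays_pence // receipt_per_record_pence)


if __name__ == "__main__":
    outlays = 600_000   # assumed outlays of £6,000 for one release
    receipt = 200       # assumed receipt of £2.00 per record
    print(break_even_units(outlays, receipt), "records to break even")
```

A calculation of this kind needs only the total outlays and the receipt per record, which is consistent with the directors' ability to think in terms of break-even without maintaining product costings.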


There was no formal system of budgeting in Axis.
Steve Jackson believed that it was futile to 
budget for “something like music where you 
have absolutely no idea how many copies you 
will sell”. However, some form of control (al- 
though never expressed as such) was exercised. 
For certain bands who were unlikely to break- 
even, the amount of recording and mixing studio 
time was restricted, less expensive graphic 
designers were selected and only a minimal 
number of pressings would be made. All these 
actions, although not quantified financially, had 
the effect of keeping the “outlays” down. The 
bands were aware of these restrictions and a 
considerable amount of jealousy and conflict 
was caused by them. The success of a record was 
judged by its position in the charts and critical 
acclaim in the music press rather than by any fi- 
nancial measure. 

In summary, the accounting system appeared 
to be constructed upon the basis of the minimal 
bookkeeping effort. From a conventional book- 
keeping perspective, the records were deficient. 
Being single-entry, there were no checks and ba- 
lances to detect error or possible fraud and, al- 
though the directors were careful to retain and 
file all receipts and sales statements, the books of 
account were not complete and rarely up to 
date. The minimal effort required on a day-to- 
day basis meant that a concerted effort was re- 
quired to produce quarterly VAT returns, band 
accounts and to prepare and audit annual ac- 
counts and tax returns. In many respects it was 
not surprising that the accountant recom- 
mended the new system. Moreover, given the 
directors’ distaste for bookkeeping, the recom- 
mendation to install a bookkeeper was advisable 
and greeted with pleasure by the directors them- 
selves. The directors claimed, however, that the 
original records were sufficient to provide the 
minimal financial information they required. It 
appears unlikely that they would have intro- 
duced a double-entry system at that time had it 
not been for the accountant’s recommendation 
and their understanding that a “proper” system 
was required by the Revenue. 

The Chairman of the Inland Revenue’s em- 
phasis upon complete and reliable taxpayers’ re- 
cords suggests that the Revenue’s gaze is focused 
upon the bookkeeping and accounting practices 
of organizations. However, what is not clear is 
how the Revenue, intentionally or otherwise, in- 
fluences these practices. Given the absence of 
specific legislation, the analysis has suggested 
that the Revenue operationalizes its gaze 
through the exercise of a disciplinary apparatus, 
based upon the principles of visibility and sur- 
veillance. The following sections explore these 
issues. 


THE PRINCIPLE OF VISIBILITY: THE BODY 
CORPORATE AND THE BODY OF THE 
DIRECTORS 


Foucault suggests that “one of the prime ef- 
fects of power is that certain bodies, certain 
gestures, certain discourses, certain desires 
come to be identified and constituted as indi- 
viduals” (1980b, p. 98). Although Foucault is 
principally concerned with the individualized 
person, it is possible, in the context of the Re- 
venue’s powers and practices, to view the indi- 
vidualizing process operating at two levels 
within the organization: the “body corporate” 
and the “body of the directors”. 


The body corporate 
A company, like the family (Donzelot, 1979), 
may be seen as an aggregate body, or “artificial 
person”. A company is referred to as the body 
corporate in English Law and is constituted as an 
individual legal, accounting and taxable entity 
and is therefore a possible site for the exercise of 
power. A number of disciplinary techniques may 
be identified at the level of the organization 
which are intended to render its activities visi- 
ble. 

The Revenue requires each organization to re- 
gister with it (Finance Act, 1976). This is also 
true of other statutory bodies including the De- 
partment of Trade and Industry (Companies Act, 
1985, s. 10) and the Customs and Excise (Value 
Added Tax Act, 1983, s. 27). The process of 
registration casts organizations as individual 
cases whereby they may be assigned to specific 
categories (on the basis of their company status, 
turnover and nature of business) and from there 
be more precisely monitored. Organizations are 
then placed within the administrative structure 
of the Revenue and are assigned to a local office 
which may permit even more intimate observa- 
tion. 

The constitution of companies as individual 
cases enables the Revenue to construct and 
maintain a unique record of, or compile a dossier 
on, each organization on its register (Wheeler, 
1969). The Revenue refers to these records as 
the “taxpayer’s file”. Central to this process of 
compiling dossiers is the legal requirement to 
submit annual returns to the Revenue (Finance 
Act, 1972, s. 14, para. 1'). As noted above, these 
returns, normally accompanied by a copy of the 
annual accounts, calculate the taxable income of 
the company prepared according to the accrual 
accounting concept. Tax returns may be seen as 
a specific form of account derived from and, in 
turn, possibly influencing accounting technol- 
ogy. They are also an example of how account- 
ing, with or without the connivance of accoun- 
tants or the accounting bodies, serves particular 
interested parties, in this case the Revenue 
(Hopwood, 1987a). 

Annual accounts and returns do not, however, 
constitute the only source of data. Dodd (1983) 
notes that “there is a variety of paper which may 
reach the taxpayer’s file and invite comparison 
with the return” (p. 2) including amongst other 
items, interest payments from banks, informa- 
tion gathered from newspapers, from investiga- 
tions of related companies, from investigations 
conducted by other statutory bodies and from 
paid informers. The Revenue is empowered to 
reward informers up to the sum of £50; how- 
ever, as Beckett & Sabine (1987) note, often the 
informer’s “reward is the discomfiture or 
pecuniary loss of his victim” (p. 14). Therefore, 
the compilation of dossiers makes each indi- 
vidual company a case to be known (Dreyfus & 
Rabinow, 1982). 

The submission of annual returns to the 
Revenue forms part of a vast and elaborate pro- 
cess of documenting the financial activities of 
each registered organization. In addition to the 
Revenue, companies are required to submit re- 
turns to the Customs and Excise (Value Added 
Tax Act, 1983, s. 14 sch. 2') and audited ac- 
counts to the Department of Trade and Industry 
(Companies Act, 1985, s. 363’? and sch. 15). 
Combined, the submissions constitute an elab- 
orate apparatus not only to observe the activities 
of organizations but also to meticulously docu- 
ment and permanently record them. The 
Revenue’s powers, therefore, are derived not 
only through the play of the gaze, but also 
through the reading of documents. (Some impli- 
cations of this apparatus will be discussed in the 
final section.) 

The disciplinary effect of documentation is 
enhanced by combining and comparing annual 
returns and accounts with those held by the 
other statutory bodies. Each of these returns is 
constructed from the same source, namely the 
company’s books of account. Comparison, 
which has only been permitted since 1972 (Finance 
Act, 1972, s. 127) and constitutes a considerable 
extension of these institutional bodies’ powers, 
provides a triangulated view of the company’s fi- 
nancial transactions, making it easier to detect 
errors or fraud and rendering the organization 
more visible. The pooling of individual dossiers 
places organizations in a vast web of highly 
specified data. The Keith Committee (1983) 
noted that of 120 cases of undeclared VAT re- 
ported by one district of the Customs and Excise 
to the Inland Revenue, 60 cases were also found 
to have understated profits. 

The registration, categorization and place- 
ment of companies and the preparation and sub- 
mission of returns and accounts may be seen as 
part of a disciplinary apparatus which facilitates 
and refines the focus of the Revenue’s gaze. The 
application of these techniques may be inter- 
preted as an attempt to lay bare the organization, 
to render it transparent and to observe within, 
the detail of its activities. 


The body of the directors 

Foucault's later work (1977a, 1980d) is 
largely concerned with examining the 
emergence of the modern individual as a docile 
and mute body by showing the interplay and 
correlative development of disciplinary 
technologies and bodies of knowledge, in par- 
ticular the normative social sciences (Dreyfus & 
Rabinow, 1982). Foucault notes that in con- 
stituting individuals as subjects of power they 
are also objects of knowledge: 


power and knowledge directly imply one another; that 
there is no power relation without the correlative con- 
stitution of a field of knowledge, nor any knowledge that 
does not presuppose and constitute at the same time 
power relations (1977a, p. 249). 


The emergence of the social sciences 
(psychiatry, psychology, demography, statistics, 
criminology, social hygiene, etc.) is closely 
linked to the development and spread of discipli- 
nary technologies. Through this interplay, a 
“technology of the body” emerged in which indi- 
viduals came to be constituted as objects of 
knowledge and sites for the exercise of power. 
In Foucault’s analytic, power is seen to operate 
upon the human body. Disciplinary techniques 
are applied at the level of the individual: 


The point where power reaches into the very grain of in- 
dividuals, touches their bodies and inserts itself into their 
actions and attitudes, their discourses, learning processes 
and everyday lives (Foucault, 1980d, p. 39). 


Within an organization the directors are the per- 
sons who are singled out for specific individuali- 
zation. Directors are statutorily required to re- 
gister themselves with the Revenue and other 
statutory bodies (notably the Department of 
Trade and Industry and the Customs and Ex- 
cise). Through such legislation, directors are 
constituted as individual subjects and become 
possible sites for the exercise of power. 

The individualizing process of the directors is 
reinforced through legislation which governs 
their activities and their interaction with the 
company. For example, legislation prohibits 
directors from taking unfair advantage, notably 
receiving tax free payments (Companies Act, 
1985, s. 311). Legislation also covers share deal- 
ings by directors and their families and restric- 
tions on a company’s power to make loans to 
directors and persons connected with them 
(Companies Act, 1985, s. 232 and 337). More- 
over, during an investigation of a company, the 
personal finances of a director may be investi- 
gated as well as those of their direct relatives 
(Taxes Management Act, 1970, s. 20). 

The individualization of directors is complex 
in that it tethers the individual to the body cor- 
porate. Individuals, in their capacity as directors 
of a company, may be seen to have dual bodies 
(Turner, 1984); they are embodied individuals 
and also are constituted as embodiments of the 
company. As such, the directors are marked out 
as the particular category of corporate member 
to be held responsible and accountable for the 
activities of the body corporate. Because of the 
relationship between the body of the directors 
and the body corporate, techniques applied at 
the level of the company may also have their 
effect at the level of the directors. In this respect, 
by laying bare the activities of the organization, 
the activities of the directors are also rendered 
visible. 

The directors’ visibility is made compulsory 
and reinforced according to a predetermined 
timetable. Directors are required to sign and de- 
liver to the Revenue annual tax returns (Taxes 
Management Act, 1970, s. 1087). In addition, 
directors are required to file a copy of the annual 
accounts with the Registrar of Companies (Com- 
panies Act, 1985, s. 298 and 241), submit a direc- 
tors’ report with the annual accounts and pro- 
vide details of their emoluments (Companies 
Act, 1985, s. 5, part V). Such requirements not 
only link the directors with the activities of a 
company but, more specifically, link them to the 
content of the annual accounts, returns and re- 
ports which in turn reflects their own financial 
integrity and conduct. 

Although they did not express it in terms of 
compulsory visibility, the directors of Axis were 
conscious of their responsibility and vulnerabil- 
ity as directors despite the anarchic and sup- 
posedly leaderless style of the company. Their 
awareness stemmed from the accountant’s in- 
structions at the time of incorporation. As 
Michael Needham commented: 


When we became a company we had to do all sorts of 
things. Have a registered office, appoint an accountant, 
keep accounts, which was Steve’s job, we’re even sup- 
posed to have an annual meeting. And if anything goes 
wrong it’s always the directors ... it’s always us that’s 
going to be on the line. 


SURVEILLANCE: THE POSSIBILITY OF 
INVESTIGATION 


In the previous section the process of indi- 
vidualization was examined as a way of marking 
out directors as possible sites for the exercise of 
power and rendering their activities and the 
activities of the organization visible. This sec- 
tion examines how the visibilities created in 
the accounting process, the organization and 
the directors’ activities are linked with 
techniques of surveillance. 
The analysis centers upon the effect of compul- 
sory visibility and surveillance from the per- 
spective of the directors of Axis, who were the 
subjects of these disciplinary techniques. 

As noted earlier, surveillance is not necessar- 
ily physically intrusive; indeed, its principal ef- 
fects are achieved “through its invisibility”. In- 
depth investigations by the Revenue, although 
increasing, are limited. The Revenue only inves- 
tigates 1% of companies and 3% of non-com- 
panies (Committee of Public Accounts, 1980). 
This strategy reflects an economy of power, that 
is, power exercised to a sufficient degree to 
“have its most intense effect on those who have 
not committed the crime. ... the potential crimi- 
nal must be convinced that the crime will be 
detected and punished” (Sheridan, 1984). The 
application of an economy of power through the 
threat of detection and punishment is a central 
element in creating a sense of continuous and 
anonymous surveillance, even though discon- 
tinuous in its action. 

The threat of detection and punishment by 
the Revenue is reinforced through the media as 
can be seen from the following headlines in The 
Times: “Tax Evaders Face Tough New Rules” (24 
March 1983), “Fiercer Hunt for Tax Dodgers 
(evaders)” (26 July 1984) and “Inland Revenue 
Plans to Put Squads of Investigators on the Trail 
of Tax Dodgers” (25 May 1985). More specific 
material is found in the professional accounting 
journals with Accountancy running a section on 
tax matters in each issue. Such publicity serves 
to create an almost mythological imagery and is 
reinforced by reports of the Revenue’s success. 
For example, “Revenue’s Tax Chasers Raise 
£138,000 Each” (The Times, 3 April 1984) and 
“Tax Fraud Yield Is Doubled” (The Times, 5 June 
1984). 

It would appear from the above headlines that 
the powers and practices of the Revenue are 
directed towards deterring tax evasion and, fail- 
ing that, detecting and punishing tax evaders. 
However, Foucault (1977a, 1980d) suggests 
that the basic goal of disciplinary power is to 
produce a “docile body” in order to render the 
behavior of individuals and populations govern- 
able. The Revenue’s powers and practices may 
have the less obvious effect, if not the intention, 
of creating a docile and accountable taxpayer, 
that is, a taxpayer who keeps proper records, 
submits accurate returns and pays their taxes. 

Of course, the Revenue is not all-powerful. It 
has a number of statutory limitations imposed 
upon it, for example, the requirement to apply 
for a warrant before searching premises. More- 
over, a plaintiff has the right to appeal against the 
manner in which the enforcement powers of the 
Revenue are applied. The Rossminster case is an 
example of this. Judgments in favor of the plain- 
tiff are not uncommon. Nevertheless, the pow- 
ers and practices of the Revenue are consider- 
able and in some respects exaggerated by the 
manner in which they are presented. Revenue 
law is replete with phrases such as “whenever 
there is good reason”, “in the inspector’s reason- 
able opinion” and “reasonable cause to believe”. 
Such phrases provide considerable latitude for 
the Revenue to interpret situations and make 
prediction of its actions precarious, even by ex- 
perts. In a sense, the Revenue is to be 
experienced but not to be known. 

There is, in fact, a continuing discourse be- 
tween the Revenue and the CCAB in order to 
seek clarification (see Technical Releases 246, 
309 and 358) of its interpretation of these am- 
biguous phrases. Indeed, the Keith Committee 
(1983) recommended that “enforcement 
powers should be precise, and logically formu- 
lated (and) the scope for administrative discre- 
tion should be reduced to a minimum”. It added 
that the Revenue’s “methods are antediluvian” 
and that “much more should be done to describe 
to the taxpayers the nature of the procedure”. 
The disciplinary ideal of creating the effect of 
continuous surveillance may never be fully 
realized. Resistance to the powers and practices 
of the Revenue is possible. The directors of Axis 
could have decided to keep no records as a pos- 
sible means of avoiding the Revenue’s gaze 
rather than developing more sophisticated ones. 
Resistance to the powers and practices of the 
Revenue is evidenced by the incidence of tax 
evasion cited above. However, Foucault 
(1977a) suggests that resistance itself forms an 
integral part of a disciplinary technology. The 
mode of exercising power may be “charac- 
terized according to the nature of the resistances 
it produces, confronts, fixes in place and man- 
ages” (Minson, 1985, p.48). In this respect, 
methods of tax collection and investigation may 
be seen to promote resistance, in the form of tax 
evasion, which may then be fixed in place, quan- 
tified and thereby managed. The Revenue’s pow- 
ers and practices may be seen to be involved in 
the production of “tax dodgers” in order to bet- 
ter understand and manage them.⁶ 


The directors’ experience of the Revenue 

The introduction of double-entry books of ac- 
counts in Axis was not based upon a reasoned 
examination of the legal requirements and prac- 
tices of the Revenue. When the new system was 
introduced the directors were simply not con- 
versant with these. As Steve Jackson com- 
mented: 


I mean there are all those books aren’t they . . . you know, 
piles of the rules. They require it to be done in their ways 
or whatever. I know that the Vatman requires us to keep 
receipts and I know that the Inland Revenue can go 
through the books if they want, but that’s about the ex- 
tent of my knowledge. 


The directors were, however, conscious of the 
Taxman’s presence. Michael Needham experi- 
enced the presence of the Revenue most 
acutely. He conjured up a spectre-like image; an 
entity without definite form but nevertheless 
with very real consequences. He was concerned 
about the possibility of an investigation and the 
consequences to him as an individual. He ex- 
pressed his concern in the following manner: 


The one thing that bothers me about the Taxman is the 
bailiffs coming here. Now something is going to have to 
be done about that. I don’t want to see them here (his 
home). I don’t want them to know about all this (his 
home and possessions) and I don’t want to have any per- 
sonal liabilities. 


He referred to the Taxman as: 


“not a very nice gentleman, someone you want to avoid.” 
and then jokingly added, “Yeah, like a black belt in ka- 
rate!” (I asked him why) “Well ... because they will 
always find something on you.” 


Steve Jackson was less concerned about the 
Revenue. He saw them as: 


Quite eager. Like any human being they probably like a 
chase, love battle and contest and all those things. I 
deduce that they like winning and don’t like losing. 


He saw the Revenue as having rules which, pro- 
viding you knew them and complied with them, 
posed no threat. Alternatively, you could hire an 
accountant or bookkeeper who knew the rules 
to act on the directors’ behalf. In reference to 
this point Steve Jackson said: 


“When starting a company, the only way it’s going to con- 
tinue is if the Taxman or Vatman allows it to continue, so 
you have to conform to their rules. And you hire an ac- 
countant who knows their rules,” and added, “If I’m not 
going to get involved in tax fiddles ... it’s better to do it 
straight. If at the end of the day there is £37 worth of pro- 
fit, I write that down in my book and the Inland Revenue 
man decides what he charges me and the accountant tells 
me what he thinks I should pay. I don’t get much more in- 
volved than that. It should be that simple.” 


6 Typically the tax dodger is portrayed as the villain and the Revenue as the protector of the citizenry as a whole (Levi, 1982). 
In justifying the introduction of the Enquiry Branch, the Royal Commission on Income Tax of 1920 made the following 
comment: 


The citizen who is deficient in public spirit has always aimed at paying less than his fair share of the nation's expenses, and 
it is safe to assume that he will always continue to do so (quoted from Beckett & Sabine, 1987, p. 1). 


The Revenue thus legitimizes its operation by directing its rhetoric towards reducing tax evasion. The disciplinary 
element of the Revenue’s powers and practices of rendering the directors of organizations visible and “coercing through the 
play of the gaze” in order to produce politically docile taxpayers is disguised beneath the cloak of protecting the social welfare. 
As Foucault notes, disguise of its powers is an integral part of a disciplinary technology: 


power is tolerable only on condition that it masks a substantial part of itself. Its success is proportional to its ability to hide 
its own mechanisms (Foucault, 1980c). 


It must be noted, however, that such a claim to being protector of the social welfare has some justification. This paper has 
concentrated on how accounting technology is employed as a disciplinary technique, to render organizations governable. 
However, accounting technology may be a technique of illumination, that is, rendering visible malpractice or attempts to 
defraud the citizenry as a whole. 
In this respect the accountant was intended to 
act as an interface between Axis and the Re- 
venue (Burchell et al., 1980). Michael Needham 
had less faith in accountants’ expertise: 


That's why we changed accountant a couple of years ago. 
He was filling in our own income tax returns . . . then I got 
a demand from the Taxman. He said he would write a let- 
ter and make an appeal . . . but he never did, he said he did 
... but he couldn’t have. So I ended up paying. They can 
make a mistake but it ends up that you’ve got to pay. 


Both directors believed that Axis was an inevita- 
ble candidate for investigation by the Revenue 
because of the unconventional nature of its op- 
erations and the high coverage the company 
commanded in the local press. As Steve Jackson 
commented: 


They are used to dealing with corporations which are 
only anarchic for the purpose of tax dodging. They have 
never met an organization who is anarchic entirely for 
the purposes of being anarchic. So of course they’re going 
to be interested in us. 


Although not expressed in terms of visibility and 
surveillance, both directors were “worried” 
(Hopwood, 1985a) about the possibility of an in- 
vestigation by the Revenue and the conse- 
quences that such an investigation would have 
for them. Even Steve Jackson was concerned that 
the Revenue could “close Axis down”. As noted, 
Michael Needham experienced the presence of 
the Revenue and threat of investigation in a very 
emotional way. This became more apparent 
when the band was investigated. Steve Jackson 
had a more measured perspective of the 
Revenue but still defined them as being some- 
what capricious and vindictive. Although the di- 
rectors’ impressions of the Revenue varied, the 
general effect was the same, namely, both ac- 
cepted “without debate” the accountant’s re- 
commendation to introduce double-entry books 
of accounts in order to satisfy the requirements 
of the Revenue, “whatever they were”. 

By examining the Revenue’s activities from 
the perspective of the individual it is not 
suggested that the powers and practices of the 
Revenue exist only in the meanings assigned or 
the impressions formed by directors. What is 
suggested, however, is that the powers and prac- 
tices do not exist independently of individuals. 
The powers and practices of the Revenue pre- 
exist individuals and are a condition of, or con- 
straint upon, their activity. Yet the entire discip- 
linary apparatus is reproduced, for the most part 
unconsciously and possibly transformed, by in- 
dividuals and would not exist unless they did so 
(Bhaskar, 1979).⁷ 

The response of the directors to the visibility 
imposed upon them and Axis demonstrates how 
external bodies may influence accounting prac- 
tice in organizations and how conformity to the 
prevailing accepted accounting techniques, in 
this case double-entry books of account, may 
occur. Such conformity reproduces the powers 
and practices of the Revenue by reinforcing its 
sense of presence and highlighting the focus of 
its gaze upon the financial transactions and 
accounting practice of organizations. 


After double-entry 

Although not regarded as a significant event in 
the history of Axis, the introduction of the dou- 
ble-entry system had its effects. The most obvi- 
ous effect of the double-entry system was the 
presence of the bookkeeper each Monday and 
the appearance of five large ledgers in which the 
financial transactions were meticulously 
recorded. 

What was significant was that neither the 
directors nor the office staff used the books of ac- 
count. They were referred to as “Susan’s Books” 


7 It is recognized that this interpretivist perspective does not lie easily with Foucault's more structuralist conception of the 
body. My objective is to attempt to portray and articulate what | observed as the directors’ responses, rather than remaining 
faithful to Foucault's position. Foucault is concerned to articulate the nature of disciplinary mechanisms without necessarily 
considering the “effects” of these mechanisms upon the individual. His concern is with the points at which power touches 
the individual and not with the individual’s experience of it. The body for Foucault is to be acted upon and is not itself 
conceived in terms of the actor. A more detailed discussion of this position lies outside the scope of this particular paper. 
(Susan being the name of the bookkeeper). As 
Steve Jackson observed: 


I just expected my accountants and bookkeepers to do 
whatever they want to do and I merely kept worrying 
about my cheque stubs and worrying about my informal 
system and knowing that on occasions when things 
seemed peculiar I could go and check back (in his own 
system). So I knew nothing about it. 


The sales statements were still meticulously 
filed, and the directors still drew much of their in- 
formation from them. The chronological listing 
of payments was terminated, however. The six- 
monthly band payment calculations were still 
performed by Steve Jackson who bemoaned the 
loss of his listings, but rather than use the books 
of account, he used the cheque stubs as the data 
source of payments. Showing me his old lists he 
said: 


I used to be able to just run my finger down my list and 
pull out all the outlays for say AX125 or AX136, now I've 
got to use the cheque stubs ... or when I can't find the 
cheque books I have to go to the invoices. I've got no idea 
how Susan files them (the invoices) now. She has her 
own numbering system and it doesn’t have anything to 
do with when they are paid. 


However, the new system did have other less 
obvious consequences. It absolved the direc- 
tors, particularly Steve Jackson, of responsibility 
for “keeping up with the finances”. He com- 
mented that when he kept a record of payments, 
he “used to know a lot more”: 


When I kept my list I used to look at the invoices and it 
was easier to remember what we had paid for and what 
we had not. Now I just get a list each week and write a 
cheque out. 


In this respect the new system distanced the 
directors and other office staff from both the ac- 
counting function and from certain events. 
Michael Needham, who was resentful of the 
control that Steve Jackson had over the finances 
in the past, said that he attempted to understand 
the system but that “it did my head in”. He was 
amused that their simple system, “which at least 
everybody could understand ... even if it wasn’t 
proper” was replaced by five separate books and 
an “incredible number of figures”. The technical 
benefits of double-entry, for example, the 
checks and balances it performs, were not obvi- 
ous to the directors. 


Although the directors and other office staff 
did not use the books of accounts they did use 
the services of the bookkeeper. When the book- 
keeper was there each Monday, she answered all 
the telephone enquiries concerning the ac- 
counts, particularly from creditors. On the other 
four days of the week, her absence was used as a 
means of delaying or avoiding difficult calls from 
the creditors; they would be asked to call back 
on Monday when the “accountant” was there. In 
addition to recording the transactions, the book- 
keeper would supply Steve Jackson a weekly list 
of the most pressing invoices to be paid. Her jus- 
tification for the document was: 


To get the creditors off my back, it’s me that’s got to ans- 
wer the phone. It (the list) also shows how much money 
there really is. If Steve sees money in the bank he just HAS 
to spend it. He’s got no idea how many bills there are to 
be paid first. 


Now payment of creditors always occurred on a 
Monday, which is an example of how events 
might become structured around the account- 
ing function in organizations. In many respects, 
the list of creditors compensated the directors 
for their loss of intimate knowledge. Instead of 
observing the events through the invoices, a 
summarized account was prepared for them. 
This was the first stage towards a proliferation of 
accounts in the organization. 

The observation that the directors and the of- 
fice staff continued, for the most part, to draw in- 
formation from the original records rather than 
the double-entry bookkeeping system (and in 
fact distanced themselves from the accounting 
function) reinforces the suggestion that the 
directors themselves would not have intro- 
duced the double-entry system. In this respect, 
the system was not introduced because of any 
technical benefits that might better serve the di- 
rectors, although the double-entry method of- 
fered considerable benefits to the accountant. 
The above section suggests that the visibility 
imparted upon the directors and the threat of in- 
vestigation by the Revenue were sufficient in- 
ducement to introduce a record keeping system 
that would be likely to satisfy the Revenue. The 
following section examines an actual Revenue 
investigation of one of Axis’s bands. The purpose 
of the section is two-fold. Firstly, it is intended to 
reveal the disciplinary nature and focus of an in- 
vestigation. Secondly, it reveals another aspect 
of the Revenue’s involvement in the develop- 
ment of accounting in Axis. While Axis itself was 
not directly involved in the investigation, its 
debt to the band (in the form of unpaid re- 
venues) was the central issue of the investiga- 
tion. 


THE INVESTIGATION AND THE 
DEVELOPMENT OF BAND ACCOUNTS

The Inland Revenue is empowered to require
a company or any person involved with that 
company to produce the company’s books of 
account and any other documentation for in- 
spection (Taxes Management Act, 1970, s. 20). 
Furthermore, if the request for information is 
not complied with, representatives of the 
Revenue may apply for a court order and, if 
necessary by force, search the premises and 
seize and remove such books and documents 
which are there. The Revenue, therefore, has direct
access to a company’s books of account and
to the documentation of transactions recorded
in them. The investigative powers of the Revenue
allow for its direct intervention into the
body corporate from where it may minutely
examine (investigations may last up to eighteen
months) the fine detail of its financial activities.

The initial review of accounts and returns re- 
ceived by the Revenue classifies them according 
to three categories, referred to as the ERA sys- 
tem. “A” signifies those accounts which are 
“accepted”, “R” refers to those which because of 
their complexity are subjected to a detailed
technical examination and “E” refers to those 
companies which will be subjected to an in- 
depth investigation. The precise mechanism by 
which accounts are classified under the ERA system
is not, however, revealed by the Revenue,
despite requests from the accounting bodies.
Reader (1981) notes: 


At the meeting giving rise to TR 358 the CCAB requested
that, in the interest of mutual understanding and as an aid 
to practitioners, the Revenue publish the guidance notes 
(on ERA procedures). The Revenue said they would con- 
sider this, but it has subsequently emerged that they are 
unwilling to publish any of the information contained in 
the notes (see Parliamentary Question of 19 November 
1979, p. 15 and Accountancy, January 1980) (p.31). 


The first contact the Revenue had with the band 
was when they were invited to attend an inter- 
view with the Enquiry Branch of the Inland Re- 
venue, which deals primarily with suspected 
fraud and/or where understatements of profits 
are considered to be substantial. Axis’s accoun- 
tant, who was also the band’s accountant, 
Michael Needham and the band itself all at- 
tended the initial interview. 

At the beginning of the interview, the band 
was asked to answer the following written questions:

(a) Have any transactions been omitted from 
or incorrectly recorded in the books of the busi- 
ness (or any other business with which you have 
been connected)?

(b) Are the accounts of the business (and any
other businesses with which you have been con- 
nected) correct and complete to the best of your 
knowledge and belief?

(c) Are your tax returns and those of the busi- 
ness (or any other business with which you have 
been connected) correct and complete to the 
best of your knowledge and belief? 

(d) Are you prepared to permit an examina- 
tion of the business records together with your 
personal financial records in order that the 
Revenue may be satisfied that your answers to 
the first three questions are correct? 

The band was not required to answer them on 
the spot and they declined to do so. They were 
also requested to sign letters of authority for the 
Revenue to approach third parties for additional 
information (typically banks). The band was 
then advised that any refusal to answer the ques- 
tions, to allow access to books and records or to 
provide letters of authority would be noted as a
refusal of co-operation and the documents
would be seized under section 20 (of the Taxes
Management Act, 1970). The ensuing interview
covered the nature of the band’s business, the
manner in which it had been conducted, the details
of records kept and the accounting system
maintained. Because of the inadequacy of their
records, the band was unable to answer satisfactorily
the questions of the investigating officer who,
at the end of the interview, advised them to 
authorize their accountant, at their expense, to 
compile a report on the situation. The report 
was to include a description of the bookkeeping 
and accounting system employed during the 
period under review, the person responsible for 
their maintenance and comments on the 
strengths and weaknesses of the system and 
methods. 

These requirements amply reflect the focus of
the Revenue’s gaze upon the bookkeeping and 
accounting practices of organizations. The re- 
quirements also reflect a number of interesting 
disciplinary techniques. Although the effect of 
an investigation is to directly expose a com- 
pany’s or, in the case of the band, a partnership’s 
accounting process to the Revenue’s gaze, the
books need not be scrutinized by the Revenue it- 
self. The emphasis is upon the subject to confess 
and disclose the full extent of their yet un- 
specified offence. The subject becomes the 
bearer of the disciplinary techniques. Effectively 
the band members were requested to investi- 
gate themselves. The dilemma they faced was 
whether to make a full disclosure and risk uncov- 
ering issues as yet unsuspected by the Revenue 
or refuse to co-operate and risk criminal pro- 
secution if the Revenue found sufficient 
grounds. The dilemma was exacerbated because 
neither the band nor the accountant had confi- 
dence in the accuracy or completeness of their 
bookkeeping records or their previously submitted
annual returns. The confession, therefore,
plays an important role in the Revenue’s disciplinary
practices.

Moreover, the strategy to advise accountants 
to conduct global evaluations of their clients’ 
accounting systems casts them, with or without 
their connivance, as agents of the Revenue. They
serve the Revenue’s interests by revealing flaws
and inaccuracies in companies’ accounting
systems, possibly beyond those of which the Revenue
is suspicious. In this sense the accountant
becomes part of the Revenue’s disciplinary
technology. 

In addition to becoming part of the Revenue’s 
apparatus, the accountant is also constituted as a 
site for the exercise of power. At the interview, 
the Revenue required a report on the work done 
by the reporting accountant on the books and 
records and the annual returns. During the in- 
vestigation, the activities of the accountant were 
therefore brought within the Revenue’s gaze; 
the accountant became accountable to the 
Revenue for his involvement in the accounting 
process of the band. The accountant’s accounta- 
bility was reinforced in that he could have been 
prosecuted under the Taxes Management Act, 1970,
s. 99 and s. 20A if the Revenue judged that he
knowingly assisted in the “making or delivery for 
any purpose of tax any incorrect returns or 
accounts”. This observation reveals another 
possible interweave between accounting and 
Revenue practice, although a different data set 
would be required to study this thoroughly. 


BAND ACCOUNTS 


The accountant’s investigation of the band’s
records revealed that they had been maintained
on a cash basis. As Axis made payments on record
sales six months in arrears, a considerable
amount of revenue normally accrued to the 
band at the end of their financial year. Moreover, 
the band, who had been with Axis from the be- 
ginning, had rarely received full payment of 
revenues. For the most part they had left a por- 
tion of money in Axis to help promote other 
musical projects. In effect, the band had grossly 
understated their revenues over the years and 
thus their taxable income. What neither Axis nor 
the band realized was that the revenues earned 
by the band were subject to taxation whether
they received them or not. The accountant had 
difficulty explaining accrual accounting to the
band and the directors of Axis. As Steve Jackson
commented:


“The whole point is that I believe that any money you
had, you should pay tax on. But it’s a bit fucking shitty to
pay tax on money that you’ve never had. That’s what’s unfair
about going from cash accounting to accrual accounting.”
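
The difference between the two bases that troubled the directors can be illustrated with a small sketch. The figures below are hypothetical and are not drawn from the band's records; only the six-month payment lag and the practice of leaving part of the money with the label follow the text. The class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Earning:
    """A share of record revenue owed to the band (all figures hypothetical)."""
    earned_month: int     # month (1-12) in which the revenue was earned
    amount: float         # band's share of that revenue, in pounds
    paid_fraction: float  # portion eventually paid out; the rest stays with the label


PAYMENT_LAG_MONTHS = 6  # the text states that payments were made six months in arrears


def accrual_income(earnings, year_end=12):
    """Income recognised when earned, whether or not it has been received."""
    return sum(e.amount for e in earnings if e.earned_month <= year_end)


def cash_income(earnings, year_end=12):
    """Income recognised only when cash is actually received within the year."""
    return sum(
        e.amount * e.paid_fraction
        for e in earnings
        if e.earned_month + PAYMENT_LAG_MONTHS <= year_end
    )


# Hypothetical year: two releases, half of each payment left with the label.
earnings = [
    Earning(earned_month=3, amount=10_000.0, paid_fraction=0.5),
    Earning(earned_month=9, amount=20_000.0, paid_fraction=0.5),
]

print(accrual_income(earnings))  # 30000.0
print(cash_income(earnings))     # 5000.0
```

Under these assumed figures the cash basis reports a sixth of the income that the accrual basis reports, which is the kind of gap the investigation brought to light.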


It transpired that it was this feature of the band’s 
accounting system which was (or became) the 
focus of the Revenue’s investigation. The band 
was instructed that their taxation liability must 
be based upon accrual accounting and they were
sent an estimate of outstanding tax owing to the 
Revenue with penalties. The accountant advised 
that the band and, therefore, Axis would have to
pay the full amount or provide evidence of the
band’s earnings from each record released to
convince the Revenue that the sum demanded
was inappropriate. Steve Jackson rationalized
this situation in the following manner: 


Michael has been panicked by this whole tax thing. He 
doesn’t have the confidence I have in these cases. He 
doesn’t really understand that when the Taxman sends 
you a letter saying “I want X pounds”. The taxman doesn’t 
actually want X he wants you to do your accounts. 


The above comments also reflect the anxiety
that the Revenue investigation of the band
caused Michael Needham, who at this point would
not talk about it with me or anybody else. 
Although Steve Jackson calculated the band’s 
share of earnings every six months, no formalized
account with details of the revenues and
expenses was constructed. Axis therefore began
the process of retrospectively constructing 
“band accounts” for each record released since 
the formation of the company. The task was huge 
and was a source of discontent in the office for 
months. The scale of the task was due to the
method of calculating payments to bands. As 
Steve Jackson commented: 


Our system has the wilful eccentricity of the 50/50 split
which makes life very difficult. You wouldn't have to do
% of Axis's accounting if we had a normal record company
accounting system. We keep 50/50 entirely for eccentric
reasons. It’s the old idea of half to you and half to
you ... that’s the only reason we have it. It’s a tradition
that we have and it makes life very difficult.


The system required that individual product
profit statements, itemizing costs and revenues
for each record, be prepared. Remuneration
based upon royalties or a percentage of sales,
which is normal in the music business, even
among the other independents, would have
entailed considerably less work. Steve Jackson
subsequently proposed that this system be introduced,
thus reflecting another change in the accounting
system.

Axis had attempted to produce band accounts
in the past. A few bands were pressing Axis to
provide greater details of their earnings or lack
of earnings. In addition, the calculations of music
publishing royalties, which were required by
law, entailed more information than Axis was
producing. Finally, overseas sales and the proposed
introduction of cassettes and compact
disks were complicating the cost structure of
Axis. Yet, past attempts to produce band
accounts had always stalled when other more
pressing issues arose. It was the Revenue which
provided the rallying cry around which the project
was finally undertaken and completed.

The band accounts, however, were not simply
to reveal the extent of the band’s unrecorded
revenues but were also foreseen as necessary for
the very real possibility that Axis itself would be
investigated as a result of the band’s investigation.
The directors were convinced of this possibility.
Steve Jackson commented:


So y'know they see millions of pounds swimming around
in this bizarre network (Axis and the bands)... and they
think “this is obviously corruption” (speaking from the
Revenue's perspective) “no one’s telling me there’s all
these millions of pounds and no one’s getting rich.” And
if I was them, with their experience of life and business I
wouldn't believe it either.


Once again the introduction of a new form of accounting
in Axis was because of the possibility of
investigation rather than its actuality. However,
at this point, the directors’ knowledge of the Revenue
was greater.

As with the introduction of the double-entry,
the band accounts had their effects. After the
band accounts had been developed, Steve
Jackson referred to the new “financial realism” of
the company. An example of this new realism is
in evidence at a meeting with the band’s managers:


Steve Jackson: “You know that your last album and single 
in Britain made a loss.” 


The manager: “I dispute that.” 


Steve Jackson: (interjecting). “You can’t dispute the 
figures, the money in and the money out. The money out 
exceeds the money in by about sixty grand. In very sim- 
ple, straightforward, objective terms.” 


The manager: “You've got the figures then? Show me.” 


Steve Jackson: “Yes, I'll give you the figures, so easy. The 
figures are (referring to the band accounts) costs ... recording
and mixing 60, video 40, the printers bill 110...
that’s where we really got caught. Pressing was 65 with
royalties the total cost are 315 grand. Total money in was
about 260 grand the money out was 315. We made a loss
of about 55 grand on the first two months sales, which is 
the main thing.” 


While the new financial realism could not be 
entirely attributed to the introduction of band 
accounts, they certainly facilitated the process; 
the above conversation would not have been 
possible without them. 
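
The arithmetic behind the figures quoted in the conversation can be checked with a short sketch of a per-record profit statement. The itemized costs, the total of 315 and the receipts of 260 (all in thousands of pounds) are taken from the quotation; the royalty figure is not stated there and is inferred here as the balancing amount, which is an assumption. The function names are illustrative only.

```python
# Figures in thousands of pounds, as quoted in the conversation.
costs = {
    "recording and mixing": 60,
    "video": 40,
    "printing": 110,
    "pressing": 65,
}
TOTAL_COSTS_QUOTED = 315  # total cost "with royalties", as quoted
MONEY_IN_QUOTED = 260     # "about 260 grand"


def implied_royalties(itemized, total_quoted):
    """Royalties are not itemized in the quote; infer them as the balancing figure."""
    return total_quoted - sum(itemized.values())


def profit_statement(itemized, royalties, money_in):
    """Return (total costs, profit or loss) for one record."""
    total = sum(itemized.values()) + royalties
    return total, money_in - total


royalties = implied_royalties(costs, TOTAL_COSTS_QUOTED)
total, result = profit_statement(costs, royalties, MONEY_IN_QUOTED)
print(royalties, total, result)  # 40 315 -55
```

On these figures the itemized costs come to 275, leaving 40 for royalties, and the loss of 55 matches the "about 55 grand" stated at the meeting.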

At the end of the research period Steve 
Jackson answered my enquiries about the posi- 
tion they were in: 


No it hasn't been resolved yet, but we're still alive and 
kicking despite our accountant, the supposed expert two 
years ago telling us to pay X pounds, and nineteen 
months ago telling us to go bankrupt ... “it was the only 
way out” . . . advising us seriously of that, and us not doing 
that. We're still here ... we're still fine. Now suddenly 
there’s more money around, our turnover’s increased 
again, suddenly we're a lot healthier. So now it’s not that 
terrible, but it would have been if we had to pay two years
ago.


The introduction of double-entry and the development
of band accounts in Axis reveal that
the organizational (and indeed historical) 
origins of particular forms of accounting 
technology cannot be purely “understood in 
terms of the needs and requirements of specific 
organizations in which it functions” (Burchell et 
al., 1980; Hopwood, 1986). Rather, their origins 
may, in part, be due to the influences of external 
bodies. In the case of Axis, the Revenue, intentionally
or otherwise, is seen to be fundamentally
involved in the introduction and development
of their accounting practices. The manner
of the Revenue’s involvement may not be re- 
duced to the mechanistic application of statut- 
ory requirements. Rather, its operations are 
characterized by the exercise of a disciplinary 
apparatus which has the effect of rendering the 
bookkeeping and accounting practice of organi- 
zations visible and subjecting those individuals 
responsible for its upkeep, to a sense of continu- 
ous surveillance such that they become con- 
vinced that errors or infractions will be detected 
and punished. In this respect, the subjects of the
Revenue’s disciplinary power become its bearers.

Accounting, however, is not only the focus of
the Revenue’s interest, it may also be an integral 
part of its technology. The Revenue is seen to 
use accounting in order to render visible the fi- 
nancial transactions of organizations. As Hop- 
wood (1986) notes: 


The accounting eye provided a means for penetrating
into the inner workings of the organization, constructing 
a strategic visibility of the economic. 


Therefore, accounting may be said to be involved
in the creation of particular forms of
economic visibility in order to create a transparent
organization. Moreover, the Revenue’s interest
may extend beyond the observation of individual
organizations to include practices intended
to monitor and even regulate the wider
organizational domain. The final section 
explores these issues. 


BEYOND AXIS 


As noted earlier, the Revenue maintains 
individual taxpayer’s files on each registered organization.
The creation of individual files constitutes
a form of knowledge, not a general body
of knowledge which has as its object a particular 
category of subjects but rather, knowledge of a 
singular entity within the category. The accumulation
of knowledge on individual cases, however,
provides a data base from which general
propositions may be formed and tested. Knowledge
from individual dossiers may be combined,
aggregated, classified, correlated, statistically
tested and formed into a general body. In turn,
specific cases may be compared with or interpreted
within the general body. A number of implications
may arise out of this practice.


Calculative norms 

The compilation and collation of individual 
dossiers permits the creation of calculative 
norms. Dodd (1983) notes the Revenue will
build up an accounting profile or model of each
company to act as a bench-mark for comparison
with the returns and accounts submitted by like 
companies. Such accounting profiles, which in- 
clude “significant ratios” derived from the In- 
come Statement and Balance Sheet, are included 
in “district profiles” and distributed to local of- 
fices in the form of “Business Notes” (Reader, 
1981, p.27). Dodd (1983) notes the tax inspec- 
tor will use such profiles to: 


satisfy himself that the apparent gross returns make economic
sense in relation to the scale of the business... and
generally expected trade margins of the business (Dodd,
1983, p.9).


In effect, a calculus or arithmetic of organiza- 
tional activities and performance is constructed. 
Much of this arithmetic is expressed in financial 
measures derived from accounting. Increas- 
ingly, Revenue investigations utilize accounting 
techniques such as ratio analysis to examine the 
submitted accounts and returns. The use of 
these techniques to evaluate and compare com- 
pany activities suggests that the financial ac- 
counting process forms part of the Revenue’s 
technology and is not just the focus of its gaze. It 
is an example of how accounting is used to 
create “a particular regime of economic calcula- 
tion” (Hopwood, 1987b), not only within or- 
ganizations, but also “of” or “around” them. As 
Hopwood (1987b) goes on to note, such re- 
gimes may be employed “to make real and pow- 
erful quite particular conceptions of economic 
and social needs” (p.213). 

For example, such data is likely to be used in 
determining government taxation policy, possibly
in the regulation of particular sectors of the
economy and in planning for national income
and expenditure. The full implications of this 
documentary apparatus are outside the scope of 
this paper; however, it is suggested here as a
worthy topic for future research. 
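
The kind of ratio comparison described in this section, in which submitted accounts are set against a trade benchmark, can be sketched as follows. This is an illustration only: the Revenue's actual classification procedures are not published, so the benchmark rate, the tolerance, the company figures and the function names below are all hypothetical assumptions.

```python
def gross_profit_rate(sales, cost_of_sales):
    """Gross profit as a fraction of sales."""
    return (sales - cost_of_sales) / sales


def below_norm(rate, trade_norm, tolerance):
    """True when a company's rate falls short of the trade norm by more than the tolerance."""
    return rate < trade_norm - tolerance


# Hypothetical submitted accounts: (sales, cost of sales) in pounds.
submitted = {
    "Company A": (200_000, 120_000),
    "Company B": (200_000, 170_000),
}
TRADE_NORM = 0.35  # assumed benchmark gross profit rate for the trade
TOLERANCE = 0.10   # assumed margin allowed before an account stands out

flagged = [
    name
    for name, (sales, cost) in submitted.items()
    if below_norm(gross_profit_rate(sales, cost), TRADE_NORM, TOLERANCE)
]
print(flagged)  # ['Company B']
```

Company A earns a rate of 0.40 and passes unnoticed; Company B earns 0.15 and stands out against the assumed norm, regardless of whether its accounts are in fact accurate.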

The power effects of the dossier are further 
enhanced by prohibiting access to it by the 
named organization. An organization is there- 
fore unaware of the contents of its dossier or 
which institutional body maintains or has access 
to it. The accumulation of knowledge on specific 
cases and the formulation of a body of general 
knowledge takes place in secrecy, expressly 
excluding those to whom it relates (Wheeler,
1969). 

Foucault notes that the written document 
proved to be a major advance in the develop- 
ment of disciplinary power (Foucault, 1977a). 
Indeed, the documentary apparatus becomes a 
critical component of the disciplinary technol- 
ogy as it “makes possible the measurement of 
overall phenomena, the description of groups, 
the characterization of collective facts” 
(Foucault, 1977a) and, hence, I would suggest,
the development of calculative norms as tools in 
the management of the organizational domain 
(Hoskins & Macve, 1986; Hopwood, 1987b). 
The documentary apparatus and the specific 
knowledge of individual cases contained within 
are the base upon which the Revenue may form 
and refine its categories, its administrative place- 
ments, its calculative norms and hence the effec- 
tiveness of its disciplinary technology. 


Normalizing judgment

The documentation of organizations and the
aggregation of this knowledge may have unintended
consequences for the performance of organizations.
Reader (1981) notes that “an apparently
low rate of gross profit or poor performance
by reference to similar local businesses or
national averages” may result in the inspector 
selecting the company for investigation. Conceivably,
the application of this arithmetic may
have a normalizing effect on organizations (Hopwood,
1986, 1987b) by coercing them to operate
within “normal” financial parameters. As evidence
of this, a column of The Times dated 25
April 1981, entitled “Why it Pays to Tell The
Truth to the Taxman”, gave the following advice:


Make sure that you and your accountant are both looking 
at the business in the same way as the Inland Revenue. If 
you are earning less from your trade than would nor- 
mally be expected, are there good convincing reasons for 
this? (Emphasis added.) 


Further evidence may be found in CCAB Techni- 
cal Release 309 which states the following: 


The CCAB referred to a few cases which had been 
brought to their attention where the inspector proposed 
to adjust accounts submitted by reference to what he 
considered should be the rate of gross profit earned,
without any other grounds being advanced for not accepting
the accounts (para 17). (Emphasis added.)


While the above does not pretend to be a com- 
plete articulation of the Revenue’s use of data 
derived from annual returns and accounts, the 
utilization of specific accounting techniques is 
an example of what Hopwood (1987b) refers to 
in the following quote: 


Accounting is seen as having played a very positive role in 
the creation of a manageable organizational domain ... 
and an important means by which organization is incorporated
into the social domain (p.213).


CONCLUSION 


The dominant image of accounting within the 
accounting literature is that of a purely technical
phenomenon. As Hopwood (1987b) notes: “Emphasis
has been placed upon the accounts that
organizations need and the technical accounting 
configurations that they must have” and with the 
“articulation of a quite homogeneous technical 
domain”. As such the roles of accounting in or- 
ganizations and society largely have been 
ignored. There is, however, a growing recogni- 
tion that such a view is deficient. Increasingly, al- 
though still with a limited voice, accounting is 
seen to intersect or intertwine with the organizational
and the social in a multitude of ways resulting
in a variety of new and challenging images
of the increasingly central, creative and
constitutive role of accounting (Burchell et al.,
1980, 1985; Cooper, 1981; Tinker et al., 1982;
Hopwood, 1983).

At the organizational level accounting may be 
seen to give rise to or reinforce particular or- 
ganizational forms or conceptions of organiza- 
tional order and culture. Accounting may impact 
upon the language of the organization (Hayes, 
1983), the sense of organizational time (Clark, 
1982; Hopwood, 1986), the meanings and sig- 
nificance that particular events and actions have 
for individual participants and may be seen to 
enhance organizational legitimacy (Meyer, 
1986). 

Some of these processes are evidenced by the 
example of Axis. At the very least, a new financial 
realism was facilitated by the band accounts. The 
new bookkeeping system had the effect of or- 
ganizing certain events around the accounting 
function and the bookkeeper. The directors’ language
expanded to include accrual accounting
although the concept was never clearly understood.
Certainly, after the investigation of the
band, the directors at Axis were far more conscious
of the accounting implications of their actions
and Steve Jackson began to use the band accounts
to justify the expenditure he was prepared
to make on particular records.

At the social level accounting is seen to be in- 
volved in particular social and political activities 
(Cooper, 1981; Tinker et al., 1982; Hopper et 
al., 1986; Hopwood, 1987b, to name but a few) 
designed to render visible and, therefore, man- 
ageable, the activities of individual organizations 
and the wider organizational domain. Calcula- 
tions based upon accounting technology are the 
basis of government taxation and are a means of 
promoting and implementing economic policies 
of the state. For example, accounting knowledge 
may be used to promote economic stabilization, 
the regulation of particular sectors of industry 
and in prices and wage control (Burchell et al., 
1980). 

Significant to this paper, the study of account- 
ing from the social and organizational perspect- 
ive reveals that many forms of accounting have 
external origins, despite the conventionally
given internal rationales. The organizational
origins of particular forms of management and financial
accounting may reflect government regulatory
provisions, the demands of financial institutions
such as banks and insurance companies
and, in the case of this paper, the
implicit or explicit injunctions from institutions
within the judicial state apparatus (namely the 
Inland Revenue). 

The concern of this paper was not just to identify
the Revenue’s possible role in influencing
accounting and bookkeeping practice, but to 
explore the mechanisms which lead to individu- 
als and organizations taking up the explicit and 
implicit injunctions of the Revenue. Therefore, 
it is concerned with connections, connections 
between the Revenue and organizations, connections
between the Revenue and the individual
and finally connections between accounting
and Revenue practice. The data, as presented,
is not intended to be exhaustive; it is
used to illustrate the connections. The ideas presented
do not stand upon the weight of evidence
but rather their intelligibility which, as Cousins 
& Hussain (1984) note, is in keeping with the 
style of Foucault. 

Drawing on the work of Foucault, it is 
suggested that the Revenue operates a discipli- 
nary technology applied both at the level of the 
company and at the level of the directors. This 
technology, which not only includes legislation 
but extends beyond, is based upon the principle 
of visibility and surveillance. The techniques 
employed include registration, categorization, 
administrative placement, the compilation of 
dossiers and the threat of investigation. Such 
techniques render visible the activities of the 
directors, the organization and the bookkeeping 
and accounting process, emphasizing the impor- 
tance of keeping them in a proper manner or in 
accordance with conventional accounting techniques.
Moreover, the Revenue is seen to be involved,
along with other statutory bodies, in the
development of a documentary apparatus concerned
with permanently recording the financial
activities of organizations. Such an apparatus
facilitates the construction of a calculus or arithmetic
of organizational activities against which
to compare and judge the performance of indi- 
vidual companies. Such a calculus may also be 
utilized by the government to monitor and regu- 
late the activities of individual companies, cate- 
gories of organizations or the entire economic 
domain. 

The application of the above disciplinary tech- 
niques reveals connections or points of inter- 
section between the Revenue and accounting 
practice. Accounting practice is seen to be both 
the focus of the Revenue’s gaze and a facilitative 
technology which renders the financial ac- 
tivities and accounting practice of individual 
organizations visible. The intersection, however,
may not be reduced to some mechanistic imagery.
In the case of disciplinary power, cause-effect
or intention and outcome are not clear
cut. The relationships cannot be encapsulated in
rigid cause-effect representations. In this respect
the application of the Revenue’s powers
and practices may manifest itself in a variety of
ways, many of which may be unintended. For ex- 
ample, the normalization of business perform- 
ance is possibly an unintended consequence of 
the Revenue’s arithmetic. Moreover, particular 
organizations may respond in a different manner 
to Axis. This may be particularly true of large 
companies employing qualified accountants and 
sophisticated accounting systems. However, this 
case study suggests that the Revenue may 
nevertheless exert an important influence upon 
a company’s adoption and development of particular
forms of accounting practice.


Abstract


This article explores the relationship between the corporatist structure of public accounting regulation and 
the internal social order of the profession in Ontario, Canada. The social control of the profession, analyzed 
in terms of Gramsci’s theory of hegemony (“moral and intellectual leadership”), is shown to be required
and facilitated by corporatist structures. The study contributes to the theorization of accounting regulation 
within the nexus of state, market and community forces. 


Accounting associations under advanced 
capitalism are increasingly intertwined with the 
State in the regulation of economic activity. 
They serve as an organizational means of ag- 
gregating, defining and communicating the in- 
terests of practitioners while at the same time 
serving to mobilize and control practitioners
through the implementation of licencing and
disclosure standards. The juxtaposition of these
two roles has been referred to in the political science
literature as “corporatism”. In the accounting
literature, several recent studies (e.g.
Willmott, 1984; Puxty et al., 1987) have drawn
upon corporatist concepts to understand and
theorize the regulation of accountancy. These 
studies seek to explore the relation between ac- 
countancy and the context in which it is prac- 
ticed in order to better understand the potential 
and limitations of accounting. 

This paper focuses on the consequences of 
corporatist forms of public accounting regula- 
tion for the internal social order of the profes- 
sion. Specifically, the paper is concerned with 
the effects of corporatist forms of regulation on
the structure of leadership in the profession, the 
forms of control applied to practice, and on the
way in which dissent within the profession is
managed. The paper is composed of two halves. 
In the first half, the concept of corporatism is in- 
troduced and a critical review of its use in the ac- 
counting literature is presented. Based on this 
review, an alternate approach to the phenome- 
non, drawing on Panitch’s (1980, 1981) defini- 
tion of the concept, is suggested. This approach 
is coupled with Gramsci’s (1971) theory of 
hegemony as a means of understanding the pro- 
cesses by which consent to corporatist political 
structures is maintained. 

The second half of the paper presents an
empirical study of the regulation of accountancy in
Ontario, Canada. This study illustrates the rela- 
tionship between the internal social order of the 
profession and its involvement in corporatist 
structures in one particular jurisdiction. The 
phenomena documented provide additional
evidence on which the theorization of
accounting regulation may be based.


*The author would like to acknowledge the assistance of Mike Wright in performing reliability checks on the content analysis
used in this paper. An earlier draft of this paper was presented to the 1985 Queen’s University Behavioral and Social
Accounting Conference and benefited from the comments of participants, particularly Norm Macintosh, Stan Davis and [...]
Lowe. The paper has also benefited from the comments of David Cooper and the journal’s reviewers.


CORPORATIST CONTROL


The concept of corporatism is at the centre of
a “growth industry” (Panitch, 1980) in the social
sciences which attempts to understand the de- 
velopment of a class of political structures in ad- 
vanced capitalism. At the most fundamental 
level, corporatism is defined as a form of interest 
intermediation and a mode of state policy im- 
plementation. The focus on interest intermedia- 
tion draws attention to the organizations, such as 
accounting associations, which serve as the 
medium through which specific functional in- 
terests are defined and aggregated. The focus on 
state policy implementation (as process not out- 
come) draws attention to a particular medium of 
control, such as standard-setting or licencing, 
through which authority is delegated and
conflicts localized. The conceptual strength of
corporatism is its focus on the articulation of these
two phenomena as they are manifest in actual
political structures (e.g. Jessop, 1979, p. 190; 
Offe, 1985, pp. 241—247). It is this strength 
which underlies its use here as a way of describ- 
ing the relationship between the accounting 
profession and the state in the regulation of the 
practice of accountancy. 

The literature has, in the main, treated
corporatism as an ideal type. It is intended as an
analytical device and not a description of any
actual political structure. For example, Schmitter
(1974) defines corporatism as: 


A system of interest representation in which constituent
units are organized into a limited number of singular,
compulsory, non-competitive, hierarchically ordered
and functionally differentiated categories, recognized or
licensed (if not created) by the state and granted a
deliberate representational monopoly within their respective
categories in exchange for observing certain controls on
the selections of leaders and articulations of demands and
supports.


This definition has been extended to create 
typologies of corporatism based on variations in 
specific attributes of Schmitter’s ideal type. A 
distinction has been made, for example, be- 
tween state corporatism, where the state has 
created a representative body, and societal
corporatism, where corporate bodies emerge
organically within civil society. There have also
been distinctions based on the level within
society at which corporatist arrangements are
established. Much of the initial analysis of the
phenomenon centered on “peak” organizations,
corporatist arrangements which allow national
economic planning and control by coordinating
labour and producers’ organizations. This has
been referred to as “macro-corporatism”.
Subsequently, the case has been made for the
importance of corporatist arrangements within
specific sectors of the economy, the
“meso-level” (Cawson, 1985), and even for the
relevance of the concept in explaining the
involvement of individual firms in economic planning,
“the micro-level” (cf. Cawson, 1986). While the
use of ideal types may serve to draw attention to
variations in the phenomena, this methodology
may also mask aspects of the phenomena which
should be theorized.

The ideal type methodology tends to treat 
corporatism as a stable social structure with pre- 
specified functional qualities. At the extreme, 
corporatism has been ideologized, for example 
in Catholic social theory, as a means of
re-establishing the organic unity of society by
aggregating labour and capital within one planning unit.
The existence of non-equilibrium forms or of 
“unintended” consequences of corporatism are 
not theorized within the approach and empiri- 
cally such phenomena are marginalized as ran- 
dom fluctuations from an ideal. An alternate ap- 
proach, reflected particularly in the work of 
Panitch (1979, 1980, 1981), treats corporatism 
as an actual political structure, the stability and 
social consequences of which are subject to em- 
pirical investigation rather than theoretical 
specification. To date, studies using corporatist
theories to explore the regulation of accounting
have relied on ideal-typical analyses. In the following
section, the use of the concept of corporatism as
an ideal type in the accounting literature is
critiqued as a prelude to an empirical study
based on the descriptive approach suggested by
Panitch.


Corporatism and accounting regulation 

The use of the concept of corporatism in the 
accounting literature has drawn heavily on the 
work of Streek & Schmitter (1985). Streek &
Schmitter have continued the ideal-type analysis
of corporatism, arguing that associations em- 
body and constitute a particular mode of social 
order equal in status to state, market and com- 
munity principles of social order. This neo-cor- 
poratist construction has been applied to the 
analysis of accounting regulation by Willmott 
(1984, 1985) and, with some critical elabora- 
tion, by Puxty et al (1987). Puxty et al, while 
acknowledging the identification of principles of 
social order by Streek & Schmitter as useful, re- 
ject corporatism (the associative principle) as a 
mode of social order. They argue instead that ac- 
counting regulation may be seen as the histori- 
cally and culturally specific nexus of state, mar- 
ket and community modes of social order. 

Puxty et al. attempt to advance the Streek &
Schmitter framework by placing it in the context 
of advanced capitalism and focusing on the con- 
tradictions within and among modes of social 
order. They conceptualize these modes of social 
order as “formally incompatible, yet substantively
interdependent”, arguing that the ability
of advanced capitalism to displace and contain 
structural contradictions lies in the intersection 
of these modes of social order. The empirical 
work reported by Puxty et al. succeeds in de- 
monstrating that differences exist in accounting 
regulation across four countries in roughly the 
same stage of economic development. Their
analysis, however, stops short of demonstrating
how the contradictions inherent in the
interaction of various modes of social order are
managed within the context of specific national
histories and institutions.

The approach taken by Puxty et al. constitutes
an advance over many versions of corporatism
(including the work of Streek & Schmitter)
which grant corporatism the status of distinct
political systems or modes of social order (cf. 
Willmott, 1984; Panitch, 1980). At the same 
time, however, the reduction of corporatism to 
the admixture of ideal types of social order fails 
to theorize why this particular political struc- 
ture has developed in some contexts, the roles it 
plays, or its consequences (cf. Panitch, 1980, p. 
161). In this approach, corporatist arrangements 
(or other forms of regulation) become part of
the current economic, political and social
systems rather than being articulated with them. In
this way, the opportunity to theorize the con- 
nection between actual political structures and 
the context in which they develop is lost. 

The use of ideal types of social order as an 
analytical tool also tends to gloss over issues
within each category. For example, the principle 
of “spontaneous solidarity” which underlies the 
“community” mode of social order presumes 
the existence of homogeneous interests. The 
creation of a “community” within the profes- 
sion, however, is problematic and not indepen- 
dent of state and market forces. Willmott 
(1985), for example, notes that the English ac- 
counting profession, through the formation of 
the Consultative Committee of Accounting
Bodies (CCAB), has achieved the ability to 
“speak with one voice”. His review of the history 
of the profession suggests that the emergence of 
the CCAB was a defensive reaction to pressure 
from the state and various interest groups.
Although the CCAB may speak on behalf of the
profession, Willmott’s work suggests that it may not
represent all interests within the profession. The 
internal reorganization of the profession to form 
the CCAB and the relation of this organization to 
state and market forces does not find theoretical 
explanation in Puxty et al’s framework. In addi- 
tion, the focus on “community” leaves no role 
for conflict or for those outside the bounds of 
the “community”. Are “outsiders” to be 
theorized as deviants or perhaps as “subcul- 
tures” (cf. Hebdige, 1979)? 


The legitimacy of corporatism

The essence of Panitch’s (1981) critique of 
the ideal-type approach to corporatism is his 
concern over the way in which dissent is con- 
tained and consensus manufactured within 
these political structures. It has been noted that 
corporatist arrangements are remarkably dura- 
ble, often continuing through changes in econ- 
omic conditions and governments. The stability 
of corporatism, however, does not appear to be 
an aspect of its design. Schmitter (1985, p. 37)
has argued that modern corporatist
arrangements have emerged as second-best solutions to
political crises. They exist below the surface of
public discourse, crucially involved in public 
policy formation and implementation, and yet 
not directly accountable to the public. The 
result is that corporatist arrangements have a 
“precarious legitimacy” (Schmitter, 1985, p. 37) 
or, as Marin (1985, p. 90) has powerfully stated, 
“a continual moral ambivalence, a lack of politi- 
cal legitimacy, an absence of normative founda- 
tion”. The survival of corporatist arrangements 
depends upon the ability of the state or the cor- 
porate body to resolve/displace persistent chal- 
lenges to the legitimacy of the relationship. 

A challenge to the legitimacy of corporatist
bodies may come from various sources. For
example, third parties affected by policies im- 
plemented by corporate bodies may challenge 
the legitimacy of the corporations’ right to act. 
Corporatism shares this weakness with all ad- 
ministrative bodies granted executive powers 
within representative democracies. Freedman 
(1978), for example, has documented recurrent
challenges to the legitimacy of administrative
bodies in the United States. These challenges
typically centre on one or more of three
characteristics: the lack of separation of legislative and
executive powers, both creating and enforcing 
regulations; decision-making processes which 
depart from judicial norms, particularly the lack 
of representation of minority interests and the 
use of in-camera procedures; and incumbents in 
these regulatory bodies who have no direct 
political accountability. These external chal- 
lenges focus on the role of these bodies in the 
implementation of state policy. 

More importantly, from the perspective of this
study, corporatism also faces a challenge to its
legitimacy due to its claim to a representational
monopoly within a particular functional area
(Grant, 1985, p. 28; Jessop, 1979, p. 201). Where
this monopoly is a matter of degree rather than
absolute, excluded interests may challenge both
the interest intermediation and policy
implementation functions of the corporate body.
Grant (1985, p. 27) has observed in this context
that, “corporatism has been a consumer, not a
producer, of legitimacy with the state as its
supplier”.

The stability of corporatism in spite of its
widely recognized legitimacy deficit requires 
explicit theorization. The Puxty et al frame- 
work recognizes this issue and suggests that the 
answer lies in the shifting dominance of various 
principles of social order, the application of a 
particular principle serving to displace crises 
thrown up by the application of other principles. 
The shift among principles is argued (p. 282) to 
be the “(often unintended) consequences of 
these parties’ efforts to mobilize their stock of 
material and ideological resources . . . to 
safeguard or advance their own individualistic 
career interests as well as the class interests of 
those on whose behalf they act”. This construc- 
tion leaves the outcome indeterminate and 
suggests a shift in the nature of regulatory
structures to accommodate recurring crises. The
persistence of corporatism suggests that the
negotiations among parties are in fact more
constrained than this approach suggests.


Corporatism as a political structure

The alternate approach to the analysis of
accounting regulation explored in this paper takes
as its starting point Panitch’s (1980, p. 173)
minimal definition of corporatism as:


A political structure within advanced capitalism which 
integrates organized socioeconomic producer groups 
through a system of representation and cooperative 
mutual integration at the leadership level and mobiliza- 
tion and social control at the mass level. 


Panitch emphasizes that this definition is 
specific and partial. It is specific in focusing on
the relationship between the state and
functional groups in the policy-making process,
stressing the process of integration among pro- 
ducer groups and identifying the concession of 
autonomy by these groups as they become, in 
part, agencies for the administration of state pol- 
icy. It is partial in that corporatist structures are 
defined as adjuncts to and interwoven with 
existing political and economic systems; the de- 
finition does not equate these structures with 
the systems in which they operate nor does it at- 
tempt to suggest that they operate outside of the 
logic of these systems.

This definition is descriptive. Panitch argues
that an advantage of this approach is that it is
consistent with a class-theoretic historical
materialist framework which has informed many
theorists’ (including Puxty et al.’s) substantive
analyses. It does not assume the desirability or
stability of corporatism and encourages research
focused on the substantive character of
corporatist structures rather than their formal
aspects (Panitch, 1980, p. 181). The exploration of
these structures proceeds from an
understanding of the structure of market and political
systems under advanced capitalism.


Corporatism and intraprofessional hegemony

As a political structure, corporatism is
inherently unstable. In terms of Panitch’s definition,
the fundamental contradiction stems from
corporatism’s role in representing the interests of a
particular socioeconomic producer group and
its role in controlling this group as an agent of 
state policy implementation. The corporatist 
role assumes that these policies will be reliably 
implemented by individuals and that individuals’ 
demands on the state will be kept within “rea- 
sonable” limits. This expectation may hold 
where such groups are homogeneous “com- 
munities” and the policies being implemented 
provide a “material basis for consent” 
(Przeworski, 1980). The accounting profession, 
however, is not a homogeneous body of
practitioners; rather, it is segmented on a number of
bases including sources of income (Montagna, 
1974, pp. 159-160), values (Rosenberg et al, 
1982), tasks (Montagna, 1974, pp. 175-179), 
and by the existence of distinct professional as- 
sociations (Johnson & Caygill, 1971). The 
demands of various segments and their reaction 
to state policy are problematic. The corporatist
role requires the managed consent of diverse 
constituencies within the profession. 

In order to understand the development and 
maintenance of consent to, and hence the legiti- 
macy of, corporatism in the profession I have 
turned to Gramsci’s (1971) theory of “intellec- 
tual and moral leadership” or hegemony. The 
theory of hegemony is a mid-range theory
appropriate to an understanding of the processes
which maintain hierarchical political structures.
These structures are a reflection of the
economic and social systems in which they are
embedded. The theory of hegemony does not
attempt to explain the origins of these structures
but focuses on the means by which they are
maintained or, through the same means, 
changed. In particular, Gramsci sought to under- 
stand why systematically disadvantaged groups 
would voluntarily support such structures.
Although the concept of hegemony draws
attention to the consent of subordinate groups to
particular political structures, the involvement of
the state implies that the potential for coercion
is always in the background. Hegemonic states 
thus represent a balance of coercion and con- 
sent but one in which consent by the majority is 
clearly present. Empirically, those using the con- 
cept of hegemony have identified two variants: 
the hegemony of consent, and the hegemony of 
coercion (cf. Hall et al, 1978; Lehman & Tinker, 
1987). 

The hegemony of consent requires that or- 
ganized opposition be “won over, neutralized, 
incorporated, defeated or contained” (Hall et 
al, 1978, p. 319). In this state a working com- 
promise is reached under which subordinate ac- 
tors accept and fulfil their roles. Gramsci 
(1971), for example, recognized that this could 
be achieved by strategies to co-opt and deprive 
rival groups of leadership, and to infuse their 
world view with evaluative criteria which legiti- 
mate their subordinate position in the social 
order. It is also possible to maintain consent by
providing material concessions which do not
undermine the essential nature of the
hierarchical relationship.

The hegemony of coercion emerges due to 
the failure, at the margin, to manufacture and 
sustain consent. There emerges a series of
“exceptional” circumstances, crises of hegemony,
which cannot be contained by the strategies
mentioned above. A qualitatively different
hegemony is, however, possible through the use
of legitimized repression such as the use of law
to maintain social roles or scapegoating of
identifiable groups. Throughout these periods, there
is a continuing attempt to re-establish the
hegemony of consent through the “diffusion and
popularization of the world view” of the
dominant group (Bates, 1978, p. 352). Hall et al.
(1978) suggest that the transition from a 
hegemony of consent to a hegemony of coercion 
may occur due to economic recession during 
which the material concessions needed to sec- 
ure consent are withdrawn or decreased. 

The recognition of a shift in the balance of 
consent and coercion within a hegemonic
system is analogous to the shift in the relative
dominance of “community” and “state” (or
“hierarchic”) principles which would be recognized in
the application of Puxty et al.’s framework to
corporate systems. The advantage of the former 
approach, however, is its explicit theorization of 
the conditions under which such changes occur 
and the recognition of the conflicts displaced in 
maintaining corporatist structures. The utility of 
this approach to corporatism is demonstrated 
below. 


INTRODUCTION TO THE EMPIRICAL STUDY 


The empirical study focuses on the regulation 
of auditing in Ontario, Canada. The development 
of corporatism in this setting is traced through 
the history of the profession to demonstrate the 
interaction between the development of state 
economic controls and the emergence of an or- 
ganized leadership within the profession. This 
process had as a partially unintended
consequence the creation of a status hierarchy in the
profession reflected materially in access to the
auditing market. The latter part of the study
documents the basis of consent to this social
order in the profession and the current
treatment of dissent through a content analysis of
documents submitted by various accounting as- 
sociations to a committee (the Professional Or- 
ganizations Committee) established by the state 
to re-establish consensus on the regulation of 
the profession. 


Corporatism in the regulation of accountancy 
in Ontario 
In cross-cultural comparisons of the degree of
corporatist development, Canada has generally
been classified as being only weakly cor- 
poratized (Wilensky, 1976; Panitch, 1979; 
Schmitter, 1981; Schmidt, 1982). While this may 
be true at the federal, “macro” level, corporatist 
arrangements have developed at the provincial, 
“meso” level (Atkinson & Coleman, 1985; Col- 
eman, 1985), particularly within certain sectors 
of the economy. The apparent lack of cor- 
poratism at the federal level and its development 
at the provincial level reflects the powers 
granted to each level of government under the 
Canadian Constitution. By and large, the
regulation of economic activities, including the
professions, is within the jurisdiction of the provinces.
The development of corporatism in accoun- 
tancy in Ontario is consistent with the division 
of labour among geographic levels of the state in 
Canada. 

The dual nature of accounting associations in 
interest intermediation and policy implementa- 
tion emerged remarkably soon after the creation 
of the first accounting association in Ontario. 
The Institute of Chartered Accountants of On- 
tario (ICAO) was incorporated in 1883; in 1897 
the Ontario Municipal Act was revised to require 
that municipalities be audited by Fellows of the 
ICAO (FCA) “or other expert accountant”. The 
FCA was the only examined designation in the 
profession; the CA designation could be attained 
by a simple vote of current members. The
revision to the Municipal Act recognized the CAs’
lobbying efforts to improve the level of financial
accountability in municipalities but also
delegated the responsibility for establishing
appropriate levels of qualifications for auditors, and
hence appropriate standards for municipal
accounting, to the ICAO. The Ontario accounting
profession, thus, from its inception, was brought 
into a corporatist relationship with the state. 

The trend towards corporatist intermediation 
in the profession continued with the CAs
successfully lobbying government on the
Companies Act (1907), becoming intimately
involved with the implementation of income
taxation (1916, 1917) and being explicitly named as
auditors of Trust Companies (1919) among 
other examples (cf. Murphy, 1986). The
culmination of the growing interdependence of the
profession and the state in the regulation of 
economic activity was the creation of the Public 
Accountancy Act in 1950 which provided the 
profession, through a Public Accountants Coun- 
cil, with statutory authority to control access to 
public accounting. In the terms of the Act, the 
council’s functions include: 


(a) the grant or refusal of licenses;

(b) the maintenance and improvement of the status and
    standards of professional qualifications of public
    accountants;

(c) the exercise of disciplinary powers; and

(d) the consideration of matters of common interest
    and concern to public accountants, and the
    submission of representations to any government ministry
    or public authority with reference to such matters.


The creation of the Public Accountants Coun- 
cil also reflected and accelerated a second trend. 
Accountancy was becoming increasingly
organized into a set of competing professional
associations. Each of these associations was
making demands on the state; for example,
submitting briefs on legislation affecting practice and
lobbying for increased protection of their own
position. The composition of the council was
originally intended to be representative of the
population of public accountants and included
eight CAs, five Certified Public Accountants, one
Accredited Public Accountant and one member
of the International Accountants and Executives
Association. The latter two members of the
council were considered as representatives of all
public accountants outside the two other
associations. In fact, the council over-represented
organized interests within the profession which at
the time consisted of about 50% unaffiliated
accountants (see Table 1). This bias towards the
existing organizations was explicit. For example,
Dana Porter, Minister of Education, reasoned
that:


after all, it is the larger organizations ... you can deal with
... it is impossible to know who speaks for the
independents because they have no elected body to speak for
them ... the only way this sort of bill [the Public
Accountancy Act] can be initiated and brought forward is by
those who are organized ... and are able to set it out in
some concrete form (statement in the Ontario
Legislature, April 4, 1950).


The Council was explicitly intended to create a 
leadership cadre through which the intent of the 
Public Accountancy Act could be implemented.

By 1980, through a series of mergers and the 
relative growth of the public accounting mem- 
bership of the ICAO, 94% of licenced public ac- 
countants were CAs (Table 1) and 12 of 15 seats 
on the Public Accountants Council were oc- 
cupied by CAs. The other three seats were oc- 
cupied by Accredited Public Accountants, a 
group which under the terms of the 1962 Public 
Accountants Act could not accept new members 
and, therefore, was, literally, a dying
organization. Although the Public Accountants Council is
nominally the body responsible for public
accounting in Ontario, many of its standard-setting
and accreditation procedures are delegated
to the ICAO. The Public Accountants Council 
and the ICAO have integrated virtually all public 
accountants within a single organizational con- 
trol structure. 

The relative positions of CAs and other
accounting associations within the Public
Accountants Council are also reflected in the market
place. According to the study conducted by 
Lazer et al. (1978), 92.5% of the public accoun- 
tants in Ontario were CAs with approximately 
40% employed by the “Big Nine” accounting 
firms. These large firms derive most (72%) of
their fee income from auditing while smaller 
firms are less specialized in this function. The 
non-CA firms surveyed are much smaller than
CA firms; the largest of these firms had only three
partners. The involvement of non-CA firms in
public accounting is commensurately small. The 
demand for public accounting services is seg- 
mented with large public accounting (CA) firms 
providing all the public accounting services to 
large firms and a significant proportion of audit 
services for smaller clients. Non-CA firms and 
smaller CA firms are specialized in non-audit 
public accounting services for smaller clients (p. 
84). The authors “conclude that a highly
competitive structure exists in the small business
client segment of the industry and that a
quasi-monopolistic structure exists in the large
business client segment” (p. 105).


TABLE 1. Licenced accountants in Ontario

              Existing licenses*                New licenses†
Year      CA     CPA   Other   % CAs       CA    CGA   Other   Denied
1983    7451       -     106    98.6      418      0       0        5
1982    7172       -     120    98.4      508      0       0        9
1981    6850       -     277    96.1‡     509      0       1       19
1980    6476       -     424    93.9      514      0       0       11
1979    6074       -     438    93.3      588      0       0       11
1978    5929       -     460    92.8      439      0       0        8
1977    5317       -     488    91.6      547      1       2        9
1976    5011       -     525    90.5      618      1       1       16
1975    4586       -     540    89.5      462      2       1       13
1974    4217       -     565    88.2      410      1       0       21
1973    3956       -     583    87.2      281      2       0       14
1972    3749       -     605    86.1      321      2       2       17
1971    3526       -     629    84.9      314      1       4       14
1970    3355       -     660    83.6      288      5       1       17
1969    3215       -     688    82.4      241      0       2       20
1968    3116       -     703    81.6      223     10       1       11
1967    2920       -     722    80.2      264      6       2       16
1966    2723       -     740    78.6      215    [?]       1       15
1965    2610       -     764    77.4      128      7       0       17
1964    2535       -     795    76.1
1963    2533       -     802    76.0§
1962    1767     659     830    54.3
1961    1647     600     797    54.1
1960    1468     518     799    52.7
1959    1421     463     829    52.4
1958    1304     427     832    50.9
1957    1211     378     856    49.5        ‖
1956    1123     323     867    48.6
1955    1005     323     869    45.7
1954     842     305     889    41.4
1953     740     302     881    38.5
1952     658     291     858    37.3
1951     613     295     814    35.6

*Abstracted from the “Summary of licensees”, Annual Report of the Public Accountants Council for Ontario, 1965-1983.
†Abstracted from the “Report of the Applications Committee”, Annual Report of the Public Accountants Council for Ontario, 1965-1983.
‡The ICAO granted CAs to 270 others and 60 CGAs.
§The ICAO granted CAs to 1112 CPAs and 25 CGAs.
‖Data unavailable; from 1951 to 1964 the activities of the council were reported to licensees by letter. These letters are no longer on file.
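
The “% CAs” column of Table 1 is not defined in the text. A minimal sketch, assuming it is the CA share of all existing licences (CA + CPA + Other, with a missing CPA entry counted as zero), reproduces the reported figures for the four sample years below. The function name `ca_share` and the choice of sample years are illustrative assumptions, not part of the original study.

```python
def ca_share(ca: int, cpa: int, other: int) -> float:
    """Percentage of existing licences held by CAs, rounded to one decimal.

    Assumption: the table's "% CAs" column is CA / (CA + CPA + Other).
    """
    total = ca + cpa + other
    return round(100 * ca / total, 1)


# (CA, CPA, Other) counts copied from Table 1; a missing CPA entry is taken as 0.
rows = {
    1983: (7451, 0, 106),
    1980: (6476, 0, 424),
    1962: (1767, 659, 830),
    1951: (613, 295, 814),
}

# "% CAs" values as printed in Table 1 for the same years.
reported = {1983: 98.6, 1980: 93.9, 1962: 54.3, 1951: 35.6}

for year, counts in rows.items():
    print(year, ca_share(*counts), reported[year])
```

Under this reading, the jump from 54.3% in 1962 to 76.0% in 1963 coincides with the disappearance of the separate CPA column after the absorption of the CPAs into the ICAO.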

The segmentation of the public accounting 
market has two implications for the present 
analysis. First, the firms which are the target of 
state intervention in the market place (large 
firms) tend to seek public accounting services
from large CA firms. The implementation of
public policy with respect to financial disclosure
and economic development will occur mainly
through those firms. Second, the dominance of
the “Big Nine” in public accounting is likely to
skew the representational concerns of the
Public Accountants Council towards the issues faced
by those firms. These concerns are likely to be
distinct from those of practitioners in small firms
or operating independently. This is also borne
out by concerns within the Institute over the 
ability of the ICAO to represent the concerns of 
small firms and independent practitioners. The 
structure of the Public Accountants Council 
thus provides a discrete channel through which 
state policy regarding economic development 
and corporate governance can be implemented. 

The creation of a corporatist structure in pub- 
lic accounting in Ontario has concentrated 
leadership, and market access, within the ICAO 
but the control exercised by the Public Accoun- 
tants Council extends over all public accoun- 
tants regardless of professional affiliation. Since 
the regulation is exclusionary, its domain also in- 
cludes those accountants whose careers skirt 
the public accounting domain or whose career 
path would lead into public accounting. Al- 
though the Public Accountants Council in- 
stitutionalized the leadership role of the ICAO in 
public accounting, this position is an ongoing 
construction being produced and reproduced 
through interactions among associations, in the 
market place and with the state. 


INTRAPROFESSIONAL HEGEMONY 


The development of corporatism in accoun- 
tancy in Ontario has created a problematic rela- 
tionship between those who fulfil the leadership 
roles and certain groups of practitioners. The 
continued existence of this political structure 
requires a degree of consent from those ad- 
versely affected. The boundaries of public ac- 
counting, for example, must be respected if 
costly monitoring and enforcement of restricted 
access is to be avoided. More importantly, those 
occupying subordinate positions must eschew 
the political process as a means of redressing 
perceived inequities. The stability of corporatism in the Ontario
accountancy profession is attributed to the CAs’ ability to
manufacture consent based on their “moral and intellectual
leadership”, i.e. hegemony (Gramsci, 1971), of the profession. The
specific strategies adopted by the CAs are considered below.

The attempt by the CAs to maintain consensual control of the
profession has been founded on two strategies. First, the CAs have
actively co-opted accountants in public practice by offering them
membership in the Institute under special circumstances such as
“close-door” examinations or granting entry by a simple vote of
membership. For much of the CAs’ history, this was done by identifying
specific individuals who were active or prominent in public accounting
to bring into membership. In 1962, however, the entire membership of
the Certified Public Accountants Association (almost 1200 people) was
brought into membership. The CPAs, due to their connections with the
provincial government, network of public accounting firms and
educational program, were the only serious rival to the CAs’ control
of public accounting in Ontario. The strategy of co-optation (or
transformism in Gramsci’s terminology) sought to ensure that all
practitioners qualified to undertake audits were members of the
Institute. The hegemony of the CAs was thus to be based on task
distinctions. It was intended that non-CAs could acknowledge the
superiority of the CAs in this task domain and, by corollary, in the
profession. This strategy, however, turned out to be only partly
successful as aspirants to audit positions continue to emerge within
other associations.

The second strategy involved the construc- 
tion of a definition of the accounting profession 
which “naturalized” the status hierarchy in the 
profession. The CAs have, largely by virtue of 
being the first organized body in accounting in 
Canada, defined accounting as a “profession” and 
hence have established the criteria on which 
competence in accountancy would be deter- 
mined. The relevance of formal education, 
examinations, apprenticeships, ethical controls 
and peer reviews is now widely accepted by all 
accounting associations (Richardson, 1986). Al- 
though there are occasional attempts to redefine 
some criteria, such as accepting experience 
gained in any accounting setting rather than just 
public accounting as a basis for eligibility for 
membership, by and large the competing positions taken by other
associations are contained within an “ideological space” (Hall, 1977)
defined by the CAs. There have been attempts by these associations to
outpace the CAs in the adoption of professional attributes but the CAs
have taken the explicit stand that they will maintain their standards
at the forefront of the profession (Richardson, 1987).

This strategy, however, is self-limiting. The criteria used are easily
(although not costlessly) mimicked by subordinate groups and the level
of credentials which can be realistically demanded cannot be raised
beyond those adopted by comparison groups in society (notably other
professions). As the CAs reach the pinnacle of credentialism and other
associations continue to upgrade their requirements, the rhetorical
power of “professionalism” to distinguish among associations is
eroded. As this strategy plays itself out, the hegemony of the CAs
becomes increasingly tenuous.

The CAs’ dominance in the profession, which underlies their
corporative role, rests on contested hegemonic principles. The working
consensus developed among associations and with the state over the
division of labour in the profession and the place of each association
within that structure is unstable and alternative ways of organizing
the profession have been proposed. The following section documents the
range of alternative principles advanced as normative bases for
organizing the profession. The struggle to re-establish a consensus in
the profession is seen to occur within a legal framework which
constrains some interests and enables others. The balance of coercion
and consent reflected in these events is systematically biased towards
the maintenance of corporatism in the profession.


Contesting professional hegemony 
The legitimacy of the Public Accountants Council’s control of public
accounting in Ontario has been subject to recurrent challenges from
those excluded from access to the audit market. The most serious of
these challenges has come from the Certified General Accountants
Association of Ontario (CGAAO). In negotiations leading up to the
creation of the Public Accountants Act in 1950, the basic premise was
that each association would be represented on the Council in
proportion to its public accounting membership. Associations thus
represented would be deemed “qualifying bodies”; any individual
gaining that association’s designation would be qualified to receive a
licence to practice public accounting. While in early drafts of the
Act the CGAAO was included as a qualifying body, the final version
opted to bring existing CGAs in public practice into the ICAO and deny
the CGAAO the status of a qualifying body (Creighton, 1984).
Subsequently, a public accounting membership re-formed within the
CGAAO, leading its executive to repeatedly put forward claims for
access to the public accounting field. In 1957 and 1961 the CGAs
petitioned to be named a qualifying body under the Public Accountants
Act. These petitions were withdrawn when CGAs then in public practice
were granted the CA designation as part of the 1962 CA/CPA merger and
subsequent amendments to the Act. In 1974, the CGAs suggested broad
changes in the composition of the Public Accountants Council which
would give each association representation on the Council and,
therefore, access to public practice. This challenge, as yet
unresolved, provides the context for the following analysis.

The 1974 petition coincided with similar 
problems in other professions (e.g. engineering, 
architecture) and the Attorney General’s office 
created the Professional Organizations Commit- 
tee (POC) to consider these situations. Specifi- 
cally the POC was given a mandate to consider: 


(1) the appropriateness of the existing division of func- 
tions and jurisdictions of these professional groups,... (2) 
the possible creation of new professional groups and sub- 
groups or the amalgamation of groups within these pro- 
fessions;... (3) the need for recognition and definition of 
roles of paraprofessionals;... (4) the amount of control 
these professional groups should have over the 

and certification of their members;... (Leal et al, 1980,
p. 1).


The POC provided a forum in which the CGAs could air their views and,
hence, forced the ICAO to legitimate its position in the profession.
The ICAO assured members that it would oppose the CGA petition. In its
1977 annual report the Institute openly recognized that professions,
in general, were facing pressure to allow greater competition. The
amendment of the Federal Combines Investigation Act (concerning the
creation of monopolies contrary to the public interest) in 1976 to
include service industries and the recommendation of the McRuer
Commission on Civil Rights (1967) that no professional body be allowed
to maintain dominance over groups in the same field are examples of
these pressures. In spite of these indicators of the times, the ICAO
remained confident that it would fare well under the POC study. It was
even suggested that the current state of affairs in the profession
reflected these values.

In 1978, the ICAO Annual Report noted that 
the composition of the POC favoured their view 
(being dominated by fellow professionals, i.e. 
lawyers) and the President of the ICAO assured 
members that the status quo would be main- 
tained. By 1979, however, the ICAO found that 
the POC staff study was considering alternatives 
contrary to their prior expectations. The study 
adopted an economic perspective to determine 
the appropriate organization of accounting. This 
perspective implicitly rejects the CA’s claim to 
be a profession and subjects it to economic tests 
of market efficiency rather than sociological 
tests of professionalism. 

Four “principles” were used by the POC staff 
to assess the current situation and possible alter- 
natives: (1) the protection of vulnerable in- 
terests; (2) fairness of regulation; (3) feasibility 
of implementation; and (4) public accountabil- 
ity. In the words of the staff report (Trebilcock et 
al, 1979, p. 40): 


... one cannot simply proceed on the basis of which interests one
wishes to favor over others. Rather, one requires generally accepted
principles to apply such that the ultimate judgments about the
balancing of competing claims will command widespread support.


It is clear, however, that the principles adopted by the POC were not
those widely held within the profession but were consistent with the
CGA petition. For example, the “fairness of regulation” principle was
defined to include fairness to aspiring professionals and thus put the
POC on the side of the CGA’s demands for greater openness of public
accounting. Similarly, the “feasibility of implementation” principle
was interpreted as limiting any regulatory actions to those that would
be supported by all “organized interests” which, again, provided
support for the CGA claims (cf. Leal et al, 1980, Chapter 1). The
following year the committee’s recommendations were put before the
legislature and the ICAO expressed their opposition.

The debate over the legitimacy of the domi- 
nance of the CAs centred around the principles 
which should be used as normative bases for or- 
ganizing the profession. The principles used by 
each party during this challenge were examined 
through a content analysis of documents pre- 
sented to the POC. This analysis is summarized 
in Table 2. 

The two main parties in the dispute were the 
CAs and the CGAs. The Public Accountants 
Council (PAC) also submitted a brief; however, 
this only received the approval of the CA mem- 
bers of the council and a minority brief was also 
submitted by the elected members of the Council (all APAs). The views
of four groups (CAs, CGAs, APAs and the PAC) are thus represented
in this analysis.

The principles cited by the CGAs contrast markedly with those cited by
the CAs. In the seventeen categories used in the analysis,¹ there are
only four categories which are cited by both the CGAs and the CAs.
Seven categories are cited only by the CGAs, while the remaining six
were cited by the CAs but not by the CGAs. The POC hearings thus
became the terrain for a clash of competing hegemonic principles. The
acceptance of either set of principles had clear implications for the
structure of professional regulation.

¹ The documents were initially reviewed to identify the principles cited by participants as sanctionable bases for their
positions. These categories were then used in content analysis procedures. The analysis used the “theme” of the paragraph,
i.e. principle invoked, as the coding unit (Holsti, 1969). The reliability of the coding procedures used in this analysis was
tested in two ways: test-retest reliability of the main coder was 0.88 (over twenty pages of text with three months between
codings), and inter-rater reliability was 0.67 (Holsti, 1969; 0.47 when corrected for base rates according to the Scott index
of reliability). Details of the procedures are provided in Richardson (1985, Appendix B).
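
The two inter-rater figures reported in the footnote on coding reliability (0.67 uncorrected, 0.47 corrected for base rates) can be related to each other with a short calculation. The sketch below is illustrative only: it assumes the uncorrected figure is simple percentage agreement between two coders and that the corrected figure follows Scott's formula, pi = (Po - Pe) / (1 - Pe), where Po is observed agreement and Pe is the agreement expected from the pooled category proportions. The function names and the two short codings are hypothetical and are not data from the study.

```python
from collections import Counter


def holsti_agreement(codes_a, codes_b):
    """Share of coding units that two codings place in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("codings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def scott_pi(codes_a, codes_b):
    """Agreement corrected for chance, using pooled category base rates."""
    observed = holsti_agreement(codes_a, codes_b)
    pooled = Counter(codes_a) + Counter(codes_b)
    total = sum(pooled.values())
    expected = sum((n / total) ** 2 for n in pooled.values())
    if expected == 1.0:
        raise ValueError("only one category used; pi is undefined")
    return (observed - expected) / (1 - expected)


def implied_chance_agreement(observed, pi):
    """Solve pi = (Po - Pe) / (1 - Pe) for the chance agreement Pe."""
    return (observed - pi) / (1 - pi)


# Hypothetical paragraph-level codings by two coders (not the study's data).
coder_a = ["equity", "equity", "unity", "unity", "public",
           "public", "public", "public", "equity", "unity"]
coder_b = ["equity", "unity", "unity", "unity", "public",
           "public", "equity", "public", "equity", "public"]

observed = holsti_agreement(coder_a, coder_b)
corrected = scott_pi(coder_a, coder_b)

# Chance agreement implied by the two figures reported in the footnote.
implied = implied_chance_agreement(0.67, 0.47)

print(f"observed agreement: {observed:.2f}")
print(f"Scott's pi:         {corrected:.2f}")
print(f"implied chance agreement for (0.67, 0.47): {implied:.2f}")
```

Under these assumptions, the reported pair of 0.67 and 0.47 corresponds to a chance agreement of roughly 0.38, which is what one would expect when a few categories account for most of the coded paragraphs.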

The CAs argued that the organization of the 
profession should be based on a single, high 
standard of competence administered by a 
single body within the public accounting field. 
These principles are also reflected in the brief 
prepared by the PAC who felt that “the training 
and examination standards set by the ICAO are 
sufficiently high to ensure a fair degree of protection
to the public and that this standard
should not be lowered.” The Council, although 
dominated by appointees of the ICAO, re- 
sponded to the CGA petition as an independent 
body, noting that “since the Petition deals with 
matters that are the responsibility of this Coun- 
cil, it is considered appropriate that Council 
should express its view.” 

The CGAs, by contrast, argued that entry to accounting should be
possible through a variety of educational and career paths based on a
functional specification of the necessary level of competence. In
particular, they argued against educational requirements such as the
CAs’ degree requirement and apprenticeship in public practice, which
restricted access due to prospective accountants’ resources, family
background or other non-task related criteria. The CGAs tied their
proposals to the government’s call for greater access to education and
more occupational training (e.g. through the Occupational Training
Act, Statutes of Ontario, 1967; and the Technical and Vocational
Training Assistance Act, Statutes of Canada, 1965).


TABLE 2. Percentage composition of principles cited in documents submitted to the POC*

                                                        Body citing
Standards                                            CA    CGA    APA    PAC

Equity:
  (1) of treatment under existing norms               0    9.1   38.5      0
  (2) of existing norms                               0    3.0    7.7      0
  (3) of representation                               0    6.1    7.7      0
  (4) of opportunity to practice (based on a        4.1   18.2    7.7    6.3
      realistic assessment of skill needs)
                                                    4.1   36.4   61.6    6.3
Unity:
  (1) of standards                                 16.3      0   11.5   18.8
  (2) of public accounting                          4.1      0    3.9   12.5
  (3) of accounting overall                           0    6.1      0      0
                                                   20.4    6.1   15.4   31.3
Public interest:
  (1) responsive to community/government            8.2   21.2    3.9    6.3
  (2) choice among practitioners                      0    6.1    3.9      0
  (3) high standards of practice                   28.6    9.1    3.9   37.5
  (4) continuing education/discipline              10.2   12.1      0   12.5
  (5) unspecified                                     0    3.0      0      0
                                                   47.0   51.5   11.7   56.3
Consistency:
  (1) with laws                                     4.1      0      0      0
  (2) with past agreements                         10.2      0      0      0
                                                   14.3      0      0      0
Opportunity:
  (1) for social mobility (multiple paths)            0    6.1   11.5      0
  (2) for social mobility (single path)            14.3      0      0      0
                                                   14.3    6.1   11.5      0
Efficiency of regulation:                             0      0      0    6.3

Number of times principles cited                     49     33     26     16
Number of paragraphs reviewed                       115     71     38     23

*Column totals may not sum to 100% due to rounding.

The CGAs also called for representation of all accountants, regardless
of speciality, on any controlling body. They support the existence of
multiple accounting bodies, arguing that the combination of market
place competition and collegial control is more effective than
collegial control alone in safeguarding the public interest. Along
with this view is the contention, contrary to the CAs, that the public
is able to differentiate between accounting designations (see Perry,
H. [Executive Director, CGAAO] CGA Magazine, V10, pp. 27-28).

The three APA members of the Public Accoun- 
tants Council submitted a minority brief on the 
CGA petition. As shown in Table 2, while reaf- 
firming the principles of unity and high stand- 
ards, the APAs suggested a different institutional 
framework for their enactment. They suggested 
that the Public Accountants Council, reconsti- 
tuted to eliminate the dominance of the ICAO, 
be given “the power to determine the qualifica- 
tions necessary for licencing and responsibility 
for establishing and maintaining standards of ex- 
cellence required of licensees.” The APA's brief 
argues that the present Council has not exer- 
cised its prerogatives in an equitable manner. 
They describe the Council’s actions as an “abuse 
of power.” 

In areas where the CGAs and CAs agree in 
principle, there are differences in emphasis. For 
example, while both clearly recognize that the 
profession must safeguard the “public interest”, 
the CAs see this as being accomplished through 
high standards of initial competence and prac- 
tice (set and maintained by the profession), 
while the CGAs see this as being accomplished 
through competition among practitioners and 
responsiveness to environmental forces (particularly
government policy).

Interestingly, the CGA’s call for representa- 
tion of all associations on any board set up to 
control entry to public accounting was rejected 
by the Society of Management Accountants of 
Ontario (Certified Management Accountants, 
CMA), who were not included in the discussions 
and indicated (through the ICAO) that they did 
not desire such a role. The CMA annual reports 
of the period, however, indicate a less clear cut 
response to the proposal. They express a willing- 
ness to “accept the responsibility of licencing... 
members” and provide “all necessary support 
facilities.” 

The Report of the POC released in April, 1980 
describes the situation in the profession as “a his- 
tory of recurring outbursts of organizational 
rivalry and recurring attempts to snatch har- 
mony from the jaws of discord.” The Report re- 
commended that: 


(1) the CGA should be incorporated by statute (rather 
than letters patent) and should be given the CGA as a re- 
served title; (2) each of the CMA, ICAO and CGA should 
be empowered to grant licenses; (3) all current public 
accountants, not members of either the CMA or ICAO, 
should become members of the CGAAO; (4) the defini- 
tion of public accounting should be expanded to include 
any action designed to add credibility to financial information;
(5) each association adopt and enforce the CICA
handbook as the standard of auditing practice; (6) public 
accountants should be licenced by passing a common 
examination set by a Public Accounting Licencing Admis- 
sions Board (PALAB) composed of representatives of 
each association, colleges and universities, and the On- 
tario Securities Commission; and (7) the ICAO transfer 
its uniform final examination role to the PALAB to serve 
as the common examination for public accountants. 


These recommendations, while “recognizing 
the pre-eminence of the ICAO in the profession”, 
would remove the dominance enjoyed by the 
ICAO. Overall, the implementation of these re- 
commendations would move the profession 
from a corporatist towards a more liberal mode 
of regulation. 

In the year following the release of the report, the unaffiliated
public accountants, who had not been contacted by the POC during its
deliberations, approached the ICAO seeking membership. The ICAO
regarded this as evidence of the preferences of this group on the
organization of the profession. “They wished to become CAs and they
wanted the Institute to remain the only qualifying body for public
accounting licenses.” The ICAO saw granting this group membership as
being “in the public interest and... totally consistent with our
commitment to a single standard”. A motion accepting them into
membership without examination was passed by an overwhelming majority
(5070 to 930; see Table 1). The CGAs, however, saw the same event as
“an apparent attempt to undermine the POC recommendations.”
Specifically, the move contradicted the recommendation that all
unaffiliated public accountants become CGAs. The APAs merged with the
CGAAO in 1981 thus giving the CGAs representation on the Public
Accountants Council. In 1982, the CGAs, following the POC report,
applied for incorporation by statute. Their bill was opposed by the
CAs, CMAs and PAC who succeeded in having auditing deleted from the
association’s list of educational responsibilities, and in having a
clause added which denied CGAs the right to practice as public
accountants. The ICAO recognized that this Act would allow the CGAs to
“obtain the enhanced credibility that a statutory umbrella would
provide.” In recent advertisements, the CGAs continue to list auditing
as one of the possible careers for their graduates. The state has not
acted on the POC recommendations. In the absence of such action the
Public Accountancy Act continues in force.


DISCUSSION 


In this article the institutional structure within which public
accounting in Ontario is represented and controlled has been typified
as a form of corporatism. The concept of corporatism is used as a
description of


a political structure within advanced capitalism which integrates
organized socioeconomic producer groups through a system of
representation and co-operative mutual interaction at the leadership
level and mobilization and social control at the mass level (Panitch,
1980, p. 173).


Corporatism has thus been approached as an ac- 
tual practice, the nature and stability of which is 
subject to empirical rather than theoretical 
analysis. Consistent with this, the article has fo- 
cused on the nature of the “representational 
monopoly” presumed in the creation of the Pub- 
lic Accountants Council and has examined the 
way in which consent to this structure is man- 
ufactured and dissent is managed within the pro- 
fession. 

The analysis of consent in the profession has 
drawn on Gramsci’s (1971) theory of 
hegemony. It was argued that, historically, the 
CAs have maintained their leadership in the pro- 
fession through the co-optation of qualified pub- 
lic accountants (transformism) and through the 
elaboration of an ideology of professionalism 
which captured the central values of subordi- 
nate groups, thereby naturalizing their position 
in the profession (expansive hegemony). The 
central analysis deals with a contemporary event 
in which the corporatist structure of regulation 
and the dominance of the CAs within the profes- 
sion were challenged and, by virtue of inaction 
by the state, i.e. their failure to implement POC 
recommendations, reproduced. 

The analysis illustrates a source of instability in this mode of
regulation in the accounting profession, namely the clear illiberalism
of the corporate form and the CAs’ hegemonic principles which conflict
with the principles of subordinate groups and the espoused principles
of the pluralist democratic state. The POC report recognized this
conflict and recommended in favour of the abolition of the Public
Accountants Council and the encouragement, in spite of obvious market
forces to the contrary, of a public accounting membership in each of
the major professional associations. The inability of the POC to
re-establish consensus in the profession means that the dominance of
the ICAO continues on the basis of law but with the continuing, and
growing, opposition of an organized subordinate group. The period
analyzed is thus one in which a change from a hegemony of consent to a
hegemony of coercion has occurred.

The analysis presented here challenges and 
complements recent attempts to theorize ac- 
counting regulation. Puxty et al (1987), for 
example, propose that accounting regulation 
can be understood within the mode of social 
order developed by Streek & Schmitter (1985; 
see also Willmott, 1984). Accounting regulation 
is presented as the historically specific conjunc- 
tion of state, community and market modes of 
social order. The authors suggest that the model
developed by Streek & Schmitter can be ex- 
panded by focusing on accounting regulation as 
a dynamic process and by inquiring into the con- 
stellation of material and ideological forces 
which affect the origin, reproduction and trans- 
formation of accounting regulation. 

Puxty et al claim that to be useful the Streek 
& Schmitter model must be dynamic, that is, it 
must theorize the interaction between princi- 
ples of social order and the evolution of forms of 
regulation. The present work suggests one ap- 
proach to theorizing the relationship between 
the profession and the state by recasting the 
“spontaneous solidarity” (pp. 278—279) of the 
professional community in terms of the theory of hegemony. As defined
in Puxty et al, spontaneous solidarity occurs through socialization or
the commonality of personal values. What is missing from this
definition is a sense of how values are formed or the relationship
between values and the social context in which the profession
operates. The theory of hegemony stresses the process by which the
values of the community are managed and ties these processes to
issues of social and economic dominance.

The theory of hegemony as used in this paper 
draws attention to the conflicts which underlie 
the solidarity of the profession. The profession is 
one terrain for the clash of alternate hegemonic 
principles in society. The “spontaneous solidar- 
ity” of the profession at any point in time reflects 
the historically contingent dominance of one set 
of hegemonic principles. The state, moreover, is 
not neutral in this struggle, but rather privileges 
the hegemony of particular groups as a means of 
co-ordinating the actions of the profession and 
the state. The theory of hegemony thus provides 
a way of understanding the connection between 
the state and communal principles of order in 
the profession. 

In conclusion, the present study contributes 
to our understanding of the nature of accounting 
regulation in advanced capitalism. It provides 
empirical insights which may inform the attempts to theorize the
nature of accounting regulation such as that of Puxty et al. (1987).
Both the complexity of the structure of regulation within any country
and the variation between countries suggest that further empirical
and theoretical work is necessary to achieve an adequate theorization
of the phenomenon.
Abstract


The purpose of this paper is to examine the empirical testability of agency theory from a falsificationist 
perspective. Following a brief discussion of economic methodology, the paper examines three main classes 
of agency models with a view to identifying the scope for empirical testing. With regard to the basic single 
period agency model, the paper argues that empirical researchers should direct their attention to 
generating and testing the comparative static implications of the model. For agency models with post-decision
information, it is argued that the scope for empirical testing is likely to be severely limited by the
researcher's inability to validate the truth value of the auxiliary hypotheses needed to generate empirically
interesting implications. In addition, it is argued that considerable care needs to be exercised when
interpreting the results of tests of “if and only if” propositions. These difficulties of interpretation are
illustrated by reference to two recent attempts to test Holmstrom’s theory of relative performance
evaluation. Finally, it is argued that agency models involving pre-decision information are practically
devoid of empirical content.


In the accountancy literature agency theory has 
been given two complementary interpretations. 
On the one hand it has been interpreted as a nor- 
mative theory of accounting, i.e. as a means of 
determining economically efficient accounting 
practices. On the other hand it has been inter- 
preted as a positive theory which explains and 
predicts contracting and monitoring structures 
in situations involving risk and information 
asymmetry. 

The main achievements of the normative liter- 
ature centre around the identification and analy- 
sis of two major types of problem which arise in 
an agency context, both of which stem from 
some form of information asymmetry between a 
relatively well informed agent and a relatively 
badly informed principal: moral hazard prob- 
lems and adverse selection problems. A moral 
hazard problem arises when the principal can- 
not observe the agent’s action selection and 
when the preference rankings of the principal 
and the agent over the set of alternative actions 
diverge. An adverse selection problem arises when an agent has access
to information prior to his or her action choice which cannot be
observed by the principal. Both the moral hazard problem and the
adverse selection problem can be overcome by the provision of improved
public information and one major topic of inquiry in the accounting
agency literature has been an attempt to identify the conditions under
which an improvement in such information leads to a strict improvement
in the welfare of both the agent and the principal.

The positively oriented agency literature has largely focused on
attempts to rationalise empirically observed contractual arrangements.
Several real life contractual relationships can be rationalised by
reference to agency-theoretic concepts (e.g. those between insurer and
insuree, client and lawyer, patient and doctor, etc.). In addition,
the accounting and finance literature has identified the contractual
relationships between corporate insiders (say company directors) and
corporate outsiders (outside shareholders and bondholders) and the
contractual relationships between subordinate employees and their
superiors as particularly promising areas for empirical research on
agency theory.

*I am grateful to Chris Green, John Grinyer, Alasdair Lonie, Bill Nixon and two anonymous referees for many helpful
comments.

The purpose of this paper is to examine the scope for empirical work
designed to test the theoretical propositions of agency theory. In
particular the paper examines the scope for devising critical tests of
agency theory along the lines advocated by the standpoint known as
falsificationism (Popper, 1959, 1972). This particular methodological
standpoint is adopted purely because it helps to throw the empirical
context of agency theory into sharp relief. Nothing in this paper is
intended to imply that other methodologies are invalid or even
inferior to falsificationism. Indeed many of the points discussed
below serve to highlight the extreme difficulties one encounters in
applying the strict tenets of falsificationism to a social science
like economics. Such difficulties are well known to economic
methodologists (e.g. Caldwell, 1984; and Blaug, 1980) and a subsidiary
purpose of the present paper is to provide concrete illustrations of
these difficulties.

The next section provides an introduction to the agency literature and
sets out a taxonomy of alternative agency models. This taxonomy
provides a useful organisational framework for the discussion of the
final section which focuses on issues relating to the testability of
agency theory. The subsequent section provides an introduction to
falsificationist methodology and outlines the comparative static
approach towards testing economic theory. The final major section
examines the testability of agency theory from such a falsificationist
perspective.




INTRODUCTION TO AGENCY THEORY


Agency theory is primarily concerned with
contractual relationships under uncertainty.
Most agency models are concerned with a
contractual relationship between a single principal
and a single agent where the former engages the
latter to carry out certain activities which we
will refer to as effort. The role of the principal
within the agency relationship is to design a re- 
ward contract for the agent, bearing in mind 
three main considerations. First, the contract 
must be sufficiently attractive to prevent the 
agent offering his or her services elsewhere. 
Second, the contract should be framed in such a
way as to provide the agent with an incentive to 
exert the required level of effort. Third, the con- 
tract must be enforceable on the basis of infor- 
mation which is observed both by the principal 
and the agent. Thus, for example, if both the
principal and the agent can observe the payoff
from the agent’s effort, but the principal cannot
observe the agent’s effort, then the contract
must specify the agent’s reward as a function 
only of the jointly observable payoff. 

Several agency models have appeared in the 
literature. These models differ from each other 
mainly in the assumptions they make about the
timing and distribution of information flows. In
addition some models implicitly rule out
communication between the agent and the principal
whilst others do not. The majority of models 
published to date can be represented as special 
cases of the general model illustrated in Fig. 1. 
The time line in Fig. 1 shows (beneath the 
line) three critical events in the life of the agency 
relationship: the time point when the contract is 
agreed, the time point when the agent decides 
his or her effort level, and the time point when 
the payoff becomes known. The dotted arrows 
above the time line identify the timing and
distribution of information flows. The variables
labelled $Y_{a1}$, $Y_{p1}$, $Y_{a2}$, $Y_{p2}$ represent information
signals which improve the recipient's ability to
predict and/or verify the true state of the world.
An “a” subscript represents information
privately received by the agent, and a “p” subscript
represents information which is observed by
both the agent and the principal. Thus, for
example, $Y_{a1}$ represents information received by
the agent (but not the principal) before making
his or her effort decision. The variables labelled
$M_1$ and $M_2$ represent messages sent by the agent
to the principal. In general these messages will
depend on information received by the agent
before the time he or she sends the message. The
variable X represents the payoff from the agent’s
effort and R(·) represents the agent’s reward
function.

Fig. 1. Time line of the general two-person agency model.

The main role of the principal in this model is 
to design a reward function for the agent. In
general this function can be based on any variable
which is observed by both parties, i.e. $Y_{p1}$, $Y_{p2}$, X,
$M_1$ and $M_2$. Some agency models also assume that
the principal can influence the information
systems which generate one or more of the signals
$Y_{a1}$, $Y_{p1}$, $Y_{p2}$ and $Y_{a2}$. In such cases the principal
must design both the reward function and the
set of information systems.

Some agency models allow the principal to
delay his or her choice of post-payoff
information system (i.e. the information system
generating $Y_{p2}$) until the payoff level is known. In such
cases the role of the principal, at the time the
contract is agreed, is to design both the reward
function of the agent and an “information
selection strategy” which specifies a decision rule for
selecting the post-payoff information system
conditional on the observed payoff.

Figure 2 provides an overview of the major
special cases of this general model which have
been analysed in the literature to date. The
figure divides agency models into six main
categories depending on the timing and distribution
of information. Of these six categories only four
are of any significance:

(i) private pre-decision information received
prior to the agent’s effort decision;

(ii) public pre-decision information received
prior to the agent’s effort decision;

(iii) private post-decision information
received after the agent's effort decision but
before the payoff becomes known; and

(iv) public post-decision information
received after the agent’s effort decision.
Within category (i), Fig. 2 distinguishes between 
models in which communication is possible and 
models in which communication is not possible. 
Within category (iv), Fig. 2 distinguishes models
which allow the decision to acquire post-payoff
information to be made conditional on the
realised payoff from models in which the
post-payoff information function is assumed to be
given exogenously.

Figure 2 contains several items of information
which may prove helpful for following the
discussion in the final section. For the categories
(i) to (iv) inclusive, Fig. 2 provides references to
the seminal articles associated with that
category and identifies (by letters in the range A-S)
the propositions which will be discussed in
detail in the final section. The first line of each
category in Fig. 2 shows the main information and
message variables involved in the models of that
category and identifies the variables which are
used to determine the agent’s reward, i.e. the
arguments of the reward function.

Fig. 2. Agency models with information.

An important special case of the general 
model which is not captured by Fig. 2 is the basic 
agency model in which there is no public or
private information. In this special case the role of
the principal is to design a reward function
which depends only on the observed payoff. In
effect the basic agency model can be viewed as a
special case of category (iv) in which no public
post-payoff information is received other than
the observed payoff.

Figure 3 presents a formal mathematical
statement of the basic agency model which also helps
to establish notation for the discussion in the
final two sections. The role of the principal in
this model is to choose a desired level of e and a
reward function R(x). The model assumes that
the principal chooses R(x) and the desired level
of e to maximise his or her expected utility
(defined by equation A). Equations B and C
represent the two constraints faced by the principal
when choosing R(x) and the desired level of e.
First, the reward contract must be sufficiently
attractive to the agent to prevent him or her from
offering his or her services elsewhere. This
constraint is expressed mathematically by equation
B which requires the agent’s expected utility
under R(x) and the desired level of e to be at
least as great as the agent’s opportunity utility.
Second, since the agent actually controls e, the
principal’s desired level of e must be a utility
maximising choice for the agent, given the
reward function R(x). This idea is captured by
equation C which requires the desired level of e
to be an optimum solution to the problem of
maximising the term inside the square brackets,
i.e., the agent’s expected utility. This constraint
reflects the assumption that the agent will select
his or her effort level so as to maximise his or her
own expected utility.

Fig. 3. The basic agency model.
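
A standard statement of the basic agency model that is consistent with the verbal description of equations A, B and C can be sketched as follows. The symbols are assumptions introduced here for illustration and are not taken from the figure: G is the principal's utility function, U the agent's utility over reward, V the agent's disutility of effort (additive separability is itself an assumption), \bar{U} the agent's opportunity utility, and f(x; e) the density of the payoff x given effort e.

```latex
\begin{align*}
\text{(A)}\quad & \max_{R(\cdot),\, e} \int G\bigl(x - R(x)\bigr)\, f(x; e)\, dx \\
\text{(B)}\quad & \text{subject to } \int U\bigl(R(x)\bigr)\, f(x; e)\, dx - V(e) \ge \bar{U} \\
\text{(C)}\quad & e \in \arg\max_{e'} \Bigl[ \int U\bigl(R(x)\bigr)\, f(x; e')\, dx - V(e') \Bigr]
\end{align*}
```

In this sketch (B) is the participation constraint and (C) the incentive constraint; the term in square brackets in (C) is the agent's expected utility.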

The final section of the paper examines the 
empirical testability of the models introduced in 
this section. It argues that agency models with 
post-decision information offer some, though li- 
mited, scope for empirical testing. Agency mod- 
els with pre-decision information, it is argued, 
are practically devoid of empirically testable im- 
plications. First of all, however, it is necessary to 
discuss the methodological bases on which 
questions of testability can be considered. 


METHODOLOGICAL GROUND RULES 


In discussing the empirical content of agency 
theory Baiman (1982) argued that the
appropriate test of any economic theory must be a
comparison between its implications and current
practice. In particular, he argued, before agency 
theory can be accepted as a proper theoretical 
foundation for management accounting it must 
be capable of explaining current practice in the 
sense that the demands for accounting that are 
observed in current practice should be deriva- 
ble as an implication of the theory. It is import- 
ant to note that it is only the empirical implica- 
tions of the theory that need to be subject to em- 
pirical testing. Agency theory is an abstraction 
from reality and, like all useful abstractions, it in- 
evitably involves several unrealistic assump- 
tions. Nevertheless it has been argued that the 
realism of a theory’s assumptions does not mat- 
ter provided its empirically testable implications 
are well corroborated. Friedman (1953) pro- 
vides the classic defence of this methodological 
position. 

For the sake of argument, this paper accepts as its
basic methodological ground rule the view that 
a theory should be evaluated according to the 
descriptive accuracy of its implications. The de- 
cision to adopt this ground rule, however, im- 
poses a responsibility on the researcher to 
specify a procedure for testing the validity or 
otherwise of a theory’s implications. The rest of 
this paper will examine the predictions of 
agency theory from the methodological 
standpoint known as falsificationism. According 
to this standpoint theories are regarded as
scientific if and only if their implications are, at least in
principle, capable of being falsified (Popper,
1963).¹ Moreover, a theory is only regarded as
having been tested when the researcher
specifies in advance the observable events that
would falsify the theory. If a theory succeeds
repeatedly in resisting falsification and if it
explains results that cannot be explained by
competing theories, it is judged to be well
corroborated.²

The decision to adopt falsificationism as the 
standard for judging the scientific status of 
agency theory is, admittedly, open to question. 
Several worthy scholars have argued that the 
harsh standards of falsificationism are inapprop- 
riate in a soft social science like economics (e.g. 
Caldwell, 1984). Even strong advocates of fal- 
sificationism in economics such as Blaug (1980) 
and Hutchison (1938) admit that the strict 
tenets of falsificationism have rarely been prac- 
tised even by the “best” economists. Moreover, 
in the specific context of agency theory, it could 
be argued that the development of the theory is 
still in the infant stage and, therefore, that any at- 
tempt to devise empirical tests of the theory is 
premature. 

However the only practical way to find out 
whether or not falsificationism is applicable to 
economics is to attempt to apply the approach to 
a large number of economic issues. Thus one 
views this paper, to some extent, as an attempt to 
explore the applicability of falsificationism to 
the economics of agency. Also the claim that
economists have eschewed falsificationism in
the past, even if true, provides no logical
grounds for arguing that it should not be tried in
the future. Finally, three points can be made with
regard to the “infant theory” argument. First, it is
well over a decade since the earliest theoretical
articles on agency theory and property rights
literature started to appear. Second, the argument
ignores the fact that agency theorists have a
tendency to make impressive claims for the positive
and normative significance of the approach.³


¹ In Popper’s words (p. 278): “While what can in principle be so overthrown and yet resists all our critical efforts to do so
may quite possibly [...] though only tentatively.”


² Strictly speaking, we subscribe to “sophisticated falsificationism”, a statistical view of testing that accepts that neither
refutation nor confirmation can be final and that all we can hope to do is to discover what is the balance of probabilities
between competing hypotheses.


³ For example, Baiman (1982) concludes: “In summary, the initial results of agency research do support the assertion that
the agency model will be a fruitful tool for future research in managerial accounting and may, indeed, provide a framework
from which a useful theory of managerial accounting can be derived.”

Surely agency theorists can’t have it both ways.
Either agency theory is an infant theory with lit- 
tle immediate practical application or it is a 
major theory of practical significance capable of 
withstanding the rigours of empirical scrutiny. 
Third, it can be argued that even infants need ex- 
posure to the real world if they are to develop 
properly. Agency theory has reached a state 
where there are numerous alternative paths 
along which more and more mathematically 
sophisticated models could be developed. At- 
tention to the empirical limitations of existing 
models may assist agency researchers to identify 
the most promising paths for future develop- 
ment. 

Bearing the above points in mind, the remain- 
der of this paper will examine the empirical tes- 
tability of agency theory from a falsificationist 
perspective. According to the falsificationist 
standpoint the testing of a theory involves three 
main steps: 

(1) establish the implications of the theory;

(2) devise a critical test which is capable of fal- 
sifying the theory; 

(3) perform the test and interpret the find- 
ings. 

With regard to the first step it is important to 
note that the implications of a theory are state- 
ments which rule out the occurrence of certain 
observable events. They are universal state- 
ments which assert that certain events cannot 
occur. The power of such statements stems from 
the fact that they can be directly refuted by a 
single occurrence of the event they rule out. 
Such universal statements stand in contrast to 
existential statements such as “there exist white 
ravens” which can never be falsified, since no 
matter how many black ravens are observed one 
can never rule out the possibility of a single 
white raven being in existence. 

Many of the difficulties involved in testing 
economic theories stem from the conditional 
nature of their empirical implications. In
particular the usual practice in economics is to
derive the implications of a change in one or more
exogenous variables for one or more
endogenous variables conditional on the values of the
other exogenous variables being held constant.
The empirical testing of such hypotheses runs
into four main problems. First, it is often difficult
to obtain data sets in which the ceteris paribus
conditions hold. More often than not the re- 
searcher is faced with data where all the exogen- 
ous variables exhibit change at once. Second, the 
researcher is often faced with the possibility that 
some other, unobservable, exogenous variable 
may be driving the results. Third, there is the 
problem of devising acceptable proxies for the 
real world counterparts of the variables rep- 
resented by the model. Finally, even if the proxy 
variables have been chosen correctly, there is al- 
ways a possibility of measurement error in the 
observed values of the proxy variable. 

The second step in the testing process re- 
quires one to devise a critical test of the theory, 
i.e. one which is capable of falsifying the theory.
It is at this stage that the problem of observabil- 
ity looms large. Frequently the researcher’s abil- 
ity to devise a critical test will depend crucially 
on which features of the real world can be
observed. This issue is particularly vexing in the
context of agency theory because the central as- 
sumption driving the novel implications of 
agency theory is an asymmetry of information 
between the principal and the agent. Agency 
problems arise because the principal has limited 
observational powers. Researchers must, there- 
fore, work under the assumption that the infor- 
mation on which they will be able to base their 
field tests will be limited, at best, to the
information set observed by the principal.

If researchers could observe the agent’s effort 
level and all the functions and parameters of the 
agency model (i.e. the utility functions of the 
agent and the principal, the set of feasible effort 
levels, the function f(x;e), and the agent’s
opportunity utility) a test of the model would be
comparatively straightforward. Researchers could
directly calculate the optimum reward schedule 
and the optimum effort level. If the actual
reward schedule and the actual effort level turned
out to be the same as the optimum levels
calculated by researchers, then it could be concluded
that the theory had been corroborated.
Otherwise it would be necessary to conclude that the
theory had been falsified. In reality, however,
most of the information needed to conduct such
a “direct” test will not be available. This situation 
is familiar to anyone with experience in testing 
economic theories. For example, the testing of 
the theory of consumer behaviour raises similar, 
but less severe, difficulties (see e.g. Blaug, 1980, 
chapter 6). Given knowledge of an individual's 
utility function, income and consumption 
choice, researchers could perform a direct test 
of the theory. In reality it is rarely possible to ob- 
serve an individual’s utility function, so such 
direct tests are rarely possible. 
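
The logic of such a "direct" test can be sketched in a few lines of Python. The example below is only an illustration under strong assumptions that are not in the paper: two possible payoffs, two effort levels, a risk neutral principal, and an agent whose utility is the square root of the reward minus an effort cost. All function names and numbers are hypothetical. It computes the optimum contract from fully observed primitives and then compares it with an "observed" contract.

```python
def optimal_contract(p_high, p_low, cost_high, cost_low, u_bar, x_low, x_high):
    """Solve a hypothetical two-outcome, two-effort agency problem.

    Assumptions (illustrative only): the principal is risk neutral and the
    agent's utility is sqrt(reward) - cost(effort). p_high and p_low are the
    probabilities of the high payoff under high and low effort, with
    p_high > p_low and cost_high > cost_low.
    Returns (effort, reward_if_low_payoff, reward_if_high_payoff).
    """
    # Cheapest way to induce high effort: the participation constraint and
    # the incentive constraint both bind. Work in utility units first.
    gap = (cost_high - cost_low) / (p_high - p_low)  # u_high - u_low
    u_low = u_bar + cost_high - p_high * gap
    u_high = u_low + gap
    if u_low < 0 or u_bar + cost_low < 0:
        raise ValueError("sqrt utility needs non-negative utility levels")
    s_low, s_high = u_low ** 2, u_high ** 2  # invert u = sqrt(s)
    profit_high = p_high * (x_high - s_high) + (1 - p_high) * (x_low - s_low)

    # Cheapest way to induce low effort: a flat wage meeting participation.
    flat = (u_bar + cost_low) ** 2
    profit_low = p_low * x_high + (1 - p_low) * x_low - flat

    if profit_high >= profit_low:
        return ("high", s_low, s_high)
    return ("low", flat, flat)


def direct_test(observed, predicted, tol=1e-6):
    """True if the observed contract matches the prediction (corroboration)."""
    same_effort = observed[0] == predicted[0]
    same_rewards = all(
        abs(o - p) <= tol for o, p in zip(observed[1:], predicted[1:])
    )
    return same_effort and same_rewards


# Hypothetical primitives; in practice these are exactly what is unobservable.
predicted = optimal_contract(0.8, 0.4, 2.0, 1.0, 3.0, 20.0, 100.0)
print(predicted)
print(direct_test(("high", 9.0, 30.25), predicted))
```

The sketch makes the informational burden explicit: every argument of `optimal_contract` must be known to the researcher before the comparison in `direct_test` can be made.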

To overcome the problem of observability, 
economists make use of the method of compara- 
tive statics. Using this method it is often possible 
to generate qualitative predictions about the di- 
rection of change in some endogenous variable 
following a change in an exogenous variable. For 
example, one can produce qualitative predic- 
tions about the response of an individual’s con- 
sumption bundle following a change in market 
prices and/or the individual’s income. This com- 
parative statics approach involves three main 
steps (for details see Silberberg, 1978):

(1) the specification of a model involving at 
least one endogenous and one exogenous
variable;

(2) the derivation of the equilibrium condi- 
tions that will be obtained if the values of the 
exogenous variables are held constant; 

(3) the perturbation of the equilibrium condi- 
tions with respect to small changes in the values 
of each exogenous variable in turn, with the ob- 
jective of identifying the qualitative direction of 
change in the endogenous variables. 

It is these comparative static propositions 
which constitute the empirically testable con- 
tent of most microeconomic theories. For 
example, a testable hypothesis of the theory of 
the firm under perfect competition is that a firm 
will produce more of a good if its price rises. The
familiar marginal cost equals marginal revenue 
proposition is an equilibrium condition which is 
not testable. 
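
The three steps of the comparative static method can be illustrated for the competitive firm with a minimal Python sketch. The quadratic cost function c(q) = 0.5 * cost_slope * q^2 and the function names are assumptions chosen for illustration. The equilibrium condition (price equals marginal cost) is used to solve for output, and the testable content is the sign of the response of output to a small price change.

```python
def optimal_output(price, cost_slope):
    """Output satisfying the equilibrium condition price = marginal cost.

    With the assumed cost c(q) = 0.5 * cost_slope * q**2, marginal cost is
    cost_slope * q, so the profit maximising output is price / cost_slope.
    """
    return price / cost_slope


def comparative_static_sign(price, cost_slope, step=1e-4):
    """Sign (+1, 0 or -1) of the change in output after a small price rise."""
    change = optimal_output(price + step, cost_slope) - optimal_output(price, cost_slope)
    return (change > 0) - (change < 0)


# Qualitative prediction: output rises with price, whatever the cost slope.
for slope in (0.5, 2.0, 10.0):
    print(slope, comparative_static_sign(10.0, slope))
```

Only the sign is predicted, so the researcher needs to observe prices and outputs but not the cost parameter itself.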

Some of the practical difficulties in devising 
tests of the comparative static implications of an 
economic theory have been referred to above 
but, in addition to these, special difficulties arise 
in attempts to apply the comparative static
method to agency theory. In the basic agency 
model reviewed above the exogenous variables 
of the model are the parameters of the princi- 
pal’s and agent’s utility function, the beliefs of 
the two parties, and the parameters of the
function f(x;e). The endogenous “variables” are the
level of effort and the reward function. The 
reader may have already noticed two special dif- 
ficulties which do not arise with the more con- 
ventional theories of the firm and the consumer. 
First, since the model assumes that the principal 
cannot observe the agent’s effort level, it is only 
sensible to assume that the level of effort will 
also be unobservable by the researcher. Second, 
whilst the reward function may be observable, 
the fact that the main prediction of the theory is 
represented in the form of a function rather than
in the form of the value of some real variable 
raises conceptual difficulties in applying the 
comparative static method. In particular how 
can one represent changes in the reward func- 
tion? One possible answer to this question will 
be examined in the next section. 

When examining the testability of agency 
theory it is necessary to make some assumptions 
about the researcher’s information set. For the
most part the discussion in the next section will 
assume that the researcher will be able to ob- 
serve the agent’s actual reward, the agent’s per- 
formance and any other information observed 
by both the principal and the agent (including 
any messages sent from one party to the other). 
It will also be assumed that the researcher can 
observe the terms of the reward contract be- 
tween the principal and the agent. We will refer 
to this information as the researcher’s primary 
information set. The analysis will argue that 
many of the major propositions of agency theory 
cannot be tested on the basis of the primary in- 
formation set alone. In these cases we attempt to 
spell out the additional information required for 
a critical test. In practice the researcher may be 
unable to observe all the information in the 
primary information set. For example, it is not 
obvious how to interpret the term “perform- 
ance” in the context of, say, a company execu- 
tive. Also in many realistic situations there may 
be no explicit reward contract or at least many
features of the contract will be implicit. Such
problems further limit the prospects for
conducting a critical test.

Difficulties also frequently arise at the third 
and final stage of the testing process:
interpreting the research findings. In practice the
evidence is rarely unequivocal. It is then a matter of
judgement as to what amount of negative
evidence is serious enough to warrant a theory’s
rejection (witness, for example, the controversy
over the efficient markets hypothesis). Second,
situations can arise where some of a theory’s
implications appear corroborated whilst others
appear to be rejected. In such cases it may be
possible to “salvage” a theory by carefully delineating
its domain of applicability. Finally, none of these 
matters can be judged in isolation from compet- 
ing theories. The ultimate value of a theory 
stems from its ability to predict phenomena that 
other theories are unable to predict. 


ON THE FALSIFIABILITY OF AGENCY THEORY 


This section examines the empirical testabil- 
ity of agency theory. The discussion begins by 
focusing on the empirical testability of the basic 
agency model. This is followed by a discussion of 
models involving pre-decision information, i.e. 
models in categories (i) and (ii). The final sub- 
sections consider the empirical testability of 
models involving post-decision information, i.e. 
models in categories (iii) and (iv). 


Testing the basic agency model 

The basic agency model gives rise to proposi- 
tions relating to the basic nature and shape of the 
reward function. Examples of propositions relat- 
ing to the basic nature of the reward function 
can be found in Harris & Raviv (1978) and 
Shavell (1979). Shavell (1979) derived the fol- 
lowing propositions from the basic model: 


Proposition A. If the agent is risk neutral his or her reward 
will equal the outcome minus a constant.

Proposition B. If the agent is strictly risk averse his or her
reward will always depend to some degree on the payoff 
but he or she will never be left bearing all the risk. 


Proposition A is testable if and only if one can 
devise a method for identifying precise risk neut- 
rality on behalf of the agent. Furthermore it may 
be difficult to find a sample of risk neutral agents. 

At first sight proposition B would appear to be 
readily testable. Moreover it applies whenever 
the payoff is subject to risk and where the 
monitoring of effort is imperfect. Some addi- 
tional evidence (outside the primary informa- 
tion set) is needed on the agent’s risk attitude 
but only to the extent necessary to establish that 
the agent is not risk neutral or risk seeking. How- 
ever, a major difficulty arises with tests of prop- 
osition B once one admits the possibility of 
transaction costs. Since contracts which involve 
either the principal or the agent receiving a fixed
payoff are structurally simpler than contracts in- 
volving risk sharing, an observation that one of 
the parties receives a fixed payoff, which con- 
tradicts proposition B, can always be attributed 
to transaction costs. Once transaction costs are
admitted into the model, therefore, proposition
B becomes effectively unfalsifiable.⁴

General empirical propositions relating to the 
shape of the reward function have proved rather 
more elusive even for the basic agency model. In 
general there is no guarantee that the reward 
function will even be monotonically increasing 
(Holmstrom, 1979; Grossman & Hart, 1983). 
The following rather weak general proposition 
has been established:


Proposition C. It is never optimal for the agent’s reward 
function to be everywhere decreasing in x. 


This proposition is readily testable but not ter- 
ribly interesting. It simply rules out the possibil- 
ity of any incentive system which pays an agent 
more for producing less. Further restrictions on 
the basic model are required to generate more 
interesting propositions. For example 
Holmstrom (1979) has established the follow- 
ing proposition: 


⁴ A contract under which the agent bears all the risk may be the Pareto optimum contract once transaction costs are taken into account.


Proposition D. The agent’s reward function will be monotonically
increasing in x if and only if $f_e(x;e)/f(x;e)$ is increasing in x.⁵


In order to test this proposition one must be 
able to check whether the monotone likelihood 
ratio condition holds. This requires detailed in- 
formation about the stochastic relationship be- 
tween effort and output which is not present in 
the primary information set. In particular, the 
monotone likelihood ratio condition cannot be 
tested on the basis of time series observations of 
output because the likelihood ratio refers to the 
effect of a small change in the effort level from 
the equilibrium effort level whilst only equilib- 
rium output levels would be observed. In prac- 
tice, therefore, it may be very difficult to check 
the monotone likelihood ratio condition. 

An alternative approach towards testing impli- 
cations about the shape of the reward function is 
the comparative statics approach mentioned in 
the previous section. The main problem in
applying the comparative static approach is that
the main agency-theoretic propositions are not
about the level of some real variable, but about 
the shape of the reward function itself. One pos- 
sible way of overcoming this problem is to im- 
pose a priori a particular functional form on the 
shape of the reward function. For example, one 
might restrict attention to the set of linear re- 
ward functions (the budget based contracts of 
Demski & Feltham (1978) also lend themselves 
to this approach). Stiglitz was able to generate 
two powerful propositions: 


Proposition E. Ceteris paribus, the slope of the reward 
function decreases as the degree of agent risk aversion in- 
creases; 


Proposition F. Ceteris paribus, the slope of the reward
function decreases as the degree of risk increases.


Such comparative static propositions are test- 
able, in principle, as follows. First, select a large 
sample of companies. Then, for each company, 
estimate the degree of sensitivity of (say) the 
chief executive's total compensation to the
company’s operating performance. Then question
each executive with a view to identifying his or
her degree of risk aversion. In addition estimates
would be required of the riskiness of the com- 
pany’s operations. Given all this information a 
test for a negative association between sensitiv- 
ity and risk, and a negative association between 
sensitivity and risk aversion would be possible. 
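
The two predicted signs can be illustrated with a short Python sketch. It does not reproduce Stiglitz's own model. It substitutes a familiar textbook linear-contract model (exponential utility, normally distributed payoff noise, quadratic effort cost), in which the optimum slope of the reward function is 1 / (1 + r * k * v), where r is the agent's absolute risk aversion, v the payoff variance and k the curvature of the effort cost. All names and numbers are assumptions for illustration.

```python
def optimal_slope(risk_aversion, payoff_variance, effort_cost=1.0):
    """Optimum slope of a linear reward function in the assumed model.

    Illustrative textbook formula, not taken from the paper:
    slope = 1 / (1 + r * k * v).
    """
    return 1.0 / (1.0 + risk_aversion * effort_cost * payoff_variance)


def is_strictly_decreasing(values):
    """True if each value is strictly smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Proposition E: slope falls as risk aversion rises (variance held constant).
slopes_by_aversion = [optimal_slope(r, 1.0) for r in (0.5, 1.0, 2.0, 4.0)]

# Proposition F: slope falls as risk rises (risk aversion held constant).
slopes_by_risk = [optimal_slope(1.0, v) for v in (0.5, 1.0, 2.0, 4.0)]

print(is_strictly_decreasing(slopes_by_aversion))
print(is_strictly_decreasing(slopes_by_risk))
```

In an empirical test the slope would be replaced by the estimated pay-performance sensitivity of each company, and the two grids by measured risk aversion and operating risk.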

There are, however, a number of difficulties 
likely to be encountered in attempts to perform 
such tests. First, the assumption of a linear re- 
ward function may be inappropriate. Second, 
there may be practical difficulties in estimating 
the executive’s total compensation (a complete 
analysis would require all forms of compensa- 
tion both pecuniary and non-pecuniary to be in- 
cluded, and the executive’s tax position may also 
be relevant). Third, there is the question of how 
firm performance should be measured (should 
accounting or market based measures be used, 
for example). Fourth, there is the problem that 
the cross-section of chief executives may differ in
several important respects which influence their 
reward function. For example they may have dif- 
ferent opportunity utility levels, different re- 
sponsibilities, and differing amounts of influence 
over the firm’s operating performance. Finally, 
and most worrying of all, there is the possibility 
that there may be a degree of matching of risk av- 
erse managers with low risk firms, i.e. there may 
be a self selection problem. It is at least conceiv- 
able that the least risk averse managers will tend 
to be employed by the riskier firms. If this effect 
does occur the Stiglitz propositions may become 
practically untestable because of collinearity be- 
tween the two independent variables. 
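
The collinearity concern can be made concrete with a small simulation. The Python sketch below is purely illustrative: the uniform distributions, the sample size and the extreme form of matching (the most risk averse manager is paired with the least risky firm, and so on down the ranking) are all assumptions, not estimates. It compares the correlation between the two explanatory variables under random assignment and under matching.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 500
risk_aversion = [rng.uniform(0.5, 5.0) for _ in range(n)]  # hypothetical managers
firm_risk = [rng.uniform(0.1, 2.0) for _ in range(n)]      # hypothetical firms

# Random assignment: the two explanatory variables are nearly uncorrelated.
random_corr = pearson(risk_aversion, firm_risk)

# Self selection: rank managers by risk aversion and firms by risk, then
# pair them in opposite order, which makes the variables almost collinear.
matched_corr = pearson(sorted(risk_aversion), sorted(firm_risk, reverse=True))

print(round(random_corr, 3), round(matched_corr, 3))
```

When the correlation under matching approaches minus one, a cross-sectional sample cannot separate the effect of risk aversion from the effect of risk on the slope of the reward function.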


⁵ Holmstrom (1979) refers to $f_e(x;e)/f(x;e)$ as the likelihood ratio. The condition that requires $f_e(x;e)/f(x;e)$ to be increasing
in x is referred to as the monotone likelihood ratio condition.


An aside on the usefulness of laboratory studies

For the sake of completeness mention should also
be made of the possibility of using laboratory
studies to test the propositions of the basic
agency model. A paper by Berg et al. (1984)
reports the results of one such test. The
experiment was designed to be performed in two
stages. In the first stage all the student subjects
were trained as agents. In the second stage all the
subjects were trained as principals and then
randomly assigned to principal/agent pairs. Stage 2
was then repeated for ten separate trials with
new pairs being chosen for each trial. The
authors used a new experimental technique
designed to induce specific risk preferences in the
principal and the agent. In effect the technique 
ensured that the principal would behave as a risk 
neutral expected payoff maximiser and the 
agent would behave in a risk and effort averse 
manner with the following utility function: 


81.5975 + 75 — 2.7e? 


where s denotes the agent’s reward and e de- 
notes the agent’s effort level. In addition to the 
assumptions underlying the basic agency model 
Berg et al. introduced three other restrictions 
on the choice situations underlying their test. 
First, it was assumed that only two levels of out- 
put were possible. Second, the agent was re- 
stricted to choose between one of two alterna- 
tive effort levels. Third, the principal was re- 
stricted to choose one from three alternative re- 
ward functions. These three alternative reward 
functions included the optimum solution to the 
principal’s problem (contract 1), a constant re- 
ward function (contract 2) which involved no 
risk sharing by the agent, and a sub-optimal re- 
ward function which imposed some risk on the 
agent but not in an optimum manner (contract 
3). Separate tests were also performed for 
scenarios in which the principal could observe 
the agent’s effort level. In this case the choice set 
of the principal was restricted to contract 1, con- 
tract 3 and an optimum forcing contract under 
which the agent received the same reward as 
under contract 2 if he supplied the higher level 
of effort, otherwise a severe penalty would be 
imposed upon the agent. 

For the case in which the principal could not 
observe the agent’s effort level, 100% of pairs 
correctly chose the optimum contract from trial 
8 onwards whilst only a small minority chose 
sub-optimum contracts in trials 1-7. For the case
in which the principal could observe the agent’s
effort level about 80% of pairs correctly chose
the optimum forcing contract after trial 8 and
about 65% chose correctly in trials 1-7.

As a critical test of agency theory the Berg et 
al. approach suffers from several limitations. 
First, there is the question of the research design. 
The design was constructed to make it very easy 
for the student subjects to perceive the effect of 
their choices on their payoff. Moreover the 
probabilities in the model were represented pic- 
torially by segments of a circle. Thus the 
mathematical structure of both the principal's 
and the agent’s choice problem was given to the 
subjects in a particularly simple form. In reality a 
large part of the principal’s task may involve the 
recognition and structuring of the underlying 
problem. By presenting the principal’s problem 
in such a straightforward form Berg et al. implicitly
assumed all such difficulties away.
Second, by restricting the action set of the prin- 
cipal to just three contracts (one of which was 
pretty obviously sub-optimal) the Berg et al. test
significantly understated the demands that 
realistic principal choice problems place on the 
calculating abilities of the principal. In reality 
the principal’s task involves the selection of an 
optimum reward function from an infinite num- 
ber of alternatives. Third, there are difficulties in 
interpreting the results of such tests. In particu- 
lar the results of the tests for the case in which 
the principal could observe the agent’s action 
raised some grounds for doubts about the descriptive
validity of the basic model. Berg et al.
concluded that their results decisively rejected 
the following null hypothesis in favour of the 
predictions of the basic model: 


H0. There is no systematic choice of contract by the principal
or action by the agent.


Since only 1 in 6 pairs would make the correct
action and contract choice if H0 was true, evidence
that about 80% of pairs chose correctly is
obviously clear evidence against H0. However,
H0 is a very weak null hypothesis, especially
once one bears in mind the simplifications introduced
into the research design by the researchers.
In effect the null hypothesis of the Berg et
al. test can be restated as follows:


H0. Given a situation where the principal is limited to
choosing from three alternative contracts, the agent is limited
to choosing between two effort levels, and where
the structure of the problem has been fully explained to
both parties, there will be no systematic choice of contract
by the principal or action by the agent.


When stated in this way the “high” success 
rate reported by Berg et al. is not particularly
surprising. Indeed probably the only surprising 
aspect of their results is that some 20% of pairs 
failed to select the optimum forcing contract 
even by the tenth trial. Hence, when viewed 
from this perspective, it is unclear whether the 
results of Berg et al. should be interpreted as evi- 
dence for or against the basic model. 
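
The weakness of H0 can be illustrated with a short calculation. The sketch below is only indicative: the number of principal/agent pairs (`n_pairs`) is a hypothetical figure, since the sample size is not reported in this review. It computes the probability, under purely random choice among the six contract/effort combinations, of observing at least 80% correct pairs.

```python
from math import comb


def binomial_upper_tail(n: int, k_min: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Three contracts times two effort levels give six equally likely
# contract/effort combinations if choices are unsystematic (H0).
p_correct_under_h0 = 1 / (3 * 2)

n_pairs = 20     # hypothetical sample size, not taken from Berg et al.
k_observed = 16  # 80% of the hypothetical 20 pairs

p_value = binomial_upper_tail(n_pairs, k_observed, p_correct_under_h0)
print(f"P(at least {k_observed} of {n_pairs} correct | H0) = {p_value:.3e}")
```

Under these assumptions the tail probability is negligible, which is exactly why rejecting H0 says little about the descriptive validity of the basic model.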


Testing agency models with pre-decision information

The discussion of this section so far has 
focused entirely on issues relating to the testabil- 
ity of the basic agency model in which the only 
information observed by the principal and the 
agent is the final payoff itself. The remainder of 
this section discusses the testability of agency 
models in which either the agent or both the 
principal and the agent receive additional infor- 
mation. This subsection discusses the testability 
of models in which information is received 
before the agent selects his or her effort level. 
The remaining subsections focus on models in 
which information is received after the agent 
selects his or her effort level. In terms of the 
taxonomy presented in Fig. 2 above, this subsection
discusses models in categories (i) and (ii)
and the remaining subsections focus on models
in categories (iii) and (iv).

The pre-decision information agency litera- 
ture has focused on two main questions. First, 
under what circumstances will the introduction 
of pre-decision information result in a Pareto im- 
provement? Second, will the principal always be 
able to design a pre-decision information system
to yield a Pareto improvement if he or she has
costless and complete control over the characteristics
of the information system?

With regard to the first question, the literature
has shown that the value of pre-decision information
is situation specific. It is possible to construct
agency situations in which the introduction
of pre-decision information leads to a Pareto
improvement and it is also possible to construct 
agency situations in which the introduction of 
pre-decision information leads to a strict Pareto 
deterioration (e.g. see Christensen, 1981 and 
1982). From a normative standpoint these find- 
ings are interesting because they alert the princi- 
pal to the possibility that the introduction of pre- 
decision information could make him or her 
worse off or better off, depending on the precise 
details of the agency situation. However, from 
the viewpoint of an empirical researcher, these 
findings are devastating for they imply that the 
model yields no empirically testable propositions
with regard to the value of pre-decision
information.

If such models are to become testable it will 
be necessary to establish empirically verifiable 
conditions which are either necessary for a pre- 
decision information system to have positive 
value or sufficient for a pre-decision information 
system to have negative value. For example, if
we had a set of conditions necessary for a pre-decision
information system to have positive value, we
could test propositions of the form “All pre-decision
information systems chosen by the principal
will satisfy these conditions.” Similarly, if we
had a set of conditions sufficient for a pre-decision
information system to have negative value,
we could test propositions of the form “No pre-decision
information system chosen by the principal
will satisfy these conditions.”

Baiman & Evans (1983) established a suffi- 
cient condition for a pre-decision information 
system to have a strictly positive value. How- 
ever, whilst this was an important normative 
achievement, such propositions are not testable 
on readily observable data. To falsify the Baiman 
& Evans proposition one would need to be able 
to observe the information systems not used by 
the agency. In particular their proposition could 
be falsified only by observing a costless pre-deci- 
sion information system which satisfied their 
sufficient condition and yet was not in use by the 
agency.

With regard to the second question, Baiman &
Evans (1983) established that a principal will always
be able to achieve at least a weak Pareto improvement
if he or she has costless control over
the design of the pre-decision information sys- 
tem. Under more stringent assumptions Penno 
(1984) has shown that the principal will always 
be able to design a pre-decision information sys- 
tem to yield a strict Pareto improvement. 

These are both important normative findings 
but, as they stand, their empirical content is se- 
verely restricted. For example the empirical 
content of Penno (1984) can be expressed as 
follows: 


Proposition G. Given the technical conditions specified 
by Penno (1984) and given costless control over the de- 
sign of the pre-decision information system, the principal 
will always be able to design a pre-decision information
system to yield a strict Pareto improvement. 


Clearly the domain of application of this proposition
is severely restricted as it only applies to
agency situations in which the principal has 
costless and complete control over the design of 
the pre-decision information system. 

Finally mention should be made of agency 
models with private pre-decision information in 
which communication between the agent and 
the principal is possible. The following empirical 
propositions have been established for such 
models: 


Proposition H. In models with private pre-decision infor- 
mation, communication is of no social value if the agent 
has perfect pre-decision information (see Baiman & 
Evans, 1983). (This follows because the principal can
infer all he or she needs to know about the agent’s private
information from the final gross payoff if he or she knows
that the agent has perfect information.)


Proposition I. In models with private pre-decision information,
a necessary and sufficient condition for communication
to be strictly valuable is for the honest revelation
of the agent’s information to be strictly valuable.


The more general of these two propositions is 
proposition I. This proposition establishes that 
the class of private information systems for 
which communication would be valuable is 
identical to the class of private information systems
for which the publication of the private information
would be valuable. The possibility of
testing this proposition, therefore, awaits the development
of a set of observable conditions for
the public reporting of private information to be 
socially valuable. 

Proposition H is directly testable for those circumstances
where the agent has perfect pre-decision
information. Clearly the domain of application
of this result is severely limited.


Testing agency models with unconditional
post-decision information

The majority of the agency literature on post- 
decision information is concerned with cate- 
gory (iv) models in which the decision to ac- 
quire information is independent of the realised 
payoff. Agency models in this category have
been used to investigate two main issues: first, to 
discover the precise conditions under which the 
introduction of public post-decision informa- 
tion will result in a strict Pareto improvement; 
second, to discover general conditions under 
which one post-decision information system 
will be strictly Pareto preferred to another. 

Holmstrom (1979) discovered precise condi- 
tions under which the introduction of public 
post-decision information will yield a strict 
Pareto improvement. He established the follow- 
ing general proposition: 


Proposition J. A post-decision information system will 
result in a strict Pareto improvement if and only if the fol- 
lowing condition is false 


$f(x, y_{p2}; e) = g(x, y_{p2})\, h(x; e)$ for all e.   (j)

Here x and $y_{p2}$ are to be interpreted as the
realised output and some other post-decision information
signal respectively. The function
$f(x, y_{p2}; e)$ is the joint probability density function
of x and $y_{p2}$ conditional on e. Condition (j) states
that if $f(x, y_{p2}; e)$ can (cannot) be factored into
two component functions of the form $g(x, y_{p2})$
and $h(x; e)$ then the observation of $y_{p2}$ is valueless
(strictly valuable).
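
The factorisation in condition (j) can be checked mechanically when the distributions are known. The sketch below assumes discrete outputs and signals, a finite set of effort levels, and that every output level has positive probability under every effort level. Under those assumptions, the joint distribution factors as in (j) exactly when the conditional distribution of the signal given the output does not depend on effort. The two example information systems and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

Joint = Dict[Tuple[int, int], float]  # (x, y) -> probability, for one effort level


def joint_from_independent(px: Dict[int, float], py: Dict[int, float]) -> Joint:
    """Build f(x, y; e) for one effort level when x and y are independent given e."""
    return {(x, y): px[x] * py[y] for x, y in product(px, py)}


def conditional_y_given_x(joint: Joint) -> Joint:
    """Return f(y | x; e), skipping any x that has zero probability."""
    marginal_x: Dict[int, float] = {}
    for (x, _y), prob in joint.items():
        marginal_x[x] = marginal_x.get(x, 0.0) + prob
    return {(x, y): prob / marginal_x[x]
            for (x, y), prob in joint.items() if marginal_x[x] > 0}


def condition_j_holds(joint_by_effort: Dict[str, Joint], tol: float = 1e-9) -> bool:
    """True when f(y | x; e) is the same for every effort level e."""
    conditionals = [conditional_y_given_x(j) for j in joint_by_effort.values()]
    reference = conditionals[0]
    for other in conditionals[1:]:
        for key in reference.keys() & other.keys():
            if abs(reference[key] - other[key]) > tol:
                return False
    return True


# Hypothetical output distributions under low and high effort.
px_low, px_high = {0: 0.6, 1: 0.4}, {0: 0.3, 1: 0.7}

# Signal that is pure noise: its distribution ignores effort.
pure_noise = {
    "low": joint_from_independent(px_low, {0: 0.5, 1: 0.5}),
    "high": joint_from_independent(px_high, {0: 0.5, 1: 0.5}),
}
# Signal whose distribution shifts with effort.
effort_signal = {
    "low": joint_from_independent(px_low, {0: 0.8, 1: 0.2}),
    "high": joint_from_independent(px_high, {0: 0.2, 1: 0.8}),
}

print(condition_j_holds(pure_noise))     # signal is valueless
print(condition_j_holds(effort_signal))  # signal is strictly valuable
```

Note that the check requires the distributions under every effort level, including levels that are never chosen in equilibrium, which is the information an outside observer typically lacks.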

The following proposition is an implication of 
proposition J: 


Proposition K. It is never desirable to offer the agent an
incentive scheme which makes his or her payment, conditional
on a particular output, a lottery rather than a certain
reward.⁶


Shavell (1979) in addition to establishing a 
proposition similar to J also established the fol- 
lowing proposition: 


Proposition L. If the agent is risk neutral, post-decision
information is valueless.


Proposition K and the “only if” part of proposition
J are only testable if one can check the truth
value of condition (j). For example the identification
of an information system in use for which
condition (j) is true would falsify the “only if”
part of proposition J. Unfortunately it is unlikely
that an independent observer would be able to
check the truth value of condition (j). In particular
note that condition (j) must hold for all values
of e including those levels of e which are sub-optimal.
An independent observer may be able
to observe the realised values of x and y but
these realisations will only relate to the optimal
level of e. To check the truth value of (j) the observer
would also need to know what values of x
and y would have occurred if sub-optimal levels
of e had been chosen.

Even greater difficulties arise in testing the “if”
part of proposition J because, in addition to the
requirement to check the truth value of condition
(j), one would also need to be able to observe
costless information systems which were
available to the agency but not being used. Note
that the “if” part of proposition J states that a
costless information system will be valuable
whenever condition (j) is false. Hence to falsify
the “if” part of proposition J one would need to
identify a costless information system which was
not being used for which condition (j) was false.

Proposition L is testable if one can find a
means of verifying precise risk neutrality on behalf
of the agent. Proposition L would be falsified
if one could discover instances of a principal
making use of post-decision information in association
with a risk neutral agent.

Gjesdal (1981) and Grossman & Hart (1983) 
focused on the identification of conditions 
under which alternative public post-decision in- 
formation systems can be ordered as to value. 
Gjesdal established the following proposition: 


Proposition M. Information system N will be strictly
more valuable than information system N’, independent
of the beliefs and preferences of the two parties, if there
exists a p × p’ Markov matrix, W, such that


$A'(s, e) = A(s, e)\, W$.   (m)


Condition (m) is also a necessary condition for N to be
Pareto superior to N’, independent of the beliefs and preferences
of the two parties, if the two information systems
are noiseless.


Here p is the number of signals receivable 
under information system N and p’ is the num- 
ber of signals receivable under information sys- 
tem N’. A is the likelihood matrix of information 
system N. The matrix A has p columns corres- 
ponding to the p signals receivable under N. 
Similarly the matrix A’, the likelihood matrix of 
information system N’, has p’ columns. The 
number of rows in both A and A’ is equal to S
times E where S is the number of output relevant
events⁷ and E is the number of possible effort
levels (E is assumed to be a finite integer). The
elements of A and A’ contain the conditional
probabilities of a particular signal being received
conditional on each effort level/output relevant 
event pair. This can be illustrated by the follow- 
ing example of a likelihood matrix. 


                                          Signal 1   Signal 2
Output relevant event 1, effort level 1      0.5        0.5
Output relevant event 2, effort level 1      0.6        0.4
Output relevant event 1, effort level 2      0.3        0.7
Output relevant event 2, effort level 2      0.2        0.8


6 In other words only information relevant for inferring the agent's effort level should enter the reward function. The
introduction of other “noise” elements which provide no additional information about the agent’s effort level into the reward
function can never result in a welfare gain.


7 Let Z represent the set of all possible states of the world. A set of output relevant events is any partition of Z into a finite
number of events such that for any two states belonging to the same event the output is the same in those two states whatever
the chosen level of effort.

Here p is equal to 2, as are E and S. Thus
there are just two columns and four rows. The 
0.6 in row 2 column 1 indicates that the proba- 
bility of signal 1 conditional on output relevant 
event 2 and effort level 1 is 0.6. 
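
To make the ordering condition concrete, the sketch below applies a Markov matrix to the example likelihood matrix. It reads condition (m) as A’ = A·W, which is an interpretation of the garbled original, and the particular matrix `W` is hypothetical. The resulting A’ is a "garbled" version of A: each of its rows is still a probability distribution over signals, but the signals discriminate less sharply between effort levels.

```python
from typing import List

Matrix = List[List[float]]


def is_markov(matrix: Matrix, tol: float = 1e-9) -> bool:
    """True when all entries are non-negative and every row sums to one."""
    return all(
        all(v >= 0 for v in row) and abs(sum(row) - 1.0) <= tol for row in matrix
    )


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b for lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Likelihood matrix A from the example: rows are (event, effort) pairs,
# columns are the p = 2 signals.
A = [
    [0.5, 0.5],  # event 1, effort level 1
    [0.6, 0.4],  # event 2, effort level 1
    [0.3, 0.7],  # event 1, effort level 2
    [0.2, 0.8],  # event 2, effort level 2
]

# Hypothetical p x p' Markov matrix (here p' = 2) that scrambles the signals.
W = [
    [0.9, 0.1],
    [0.2, 0.8],
]

A_prime = matmul(A, W)

assert is_markov(W) and is_markov(A) and is_markov(A_prime)
for row in A_prime:
    print([round(v, 2) for v in row])
```

As with condition (j), constructing A requires the signal probabilities for every effort level, not only the level chosen in equilibrium.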

The testability of proposition M is plagued by 
the same fundamental difficulty which sur- 
rounds the testability of proposition J. In particu- 
lar the matrices A and A’ include rows for all ef- 
fort levels whilst only the outcomes relating to 
the optimal effort level will normally be observa- 
ble. 

The problem of verifying a sufficient statistic
condition also arises with respect to Dye’s category
(iii) models, i.e. models with private post-decision
information and communication.

Dye (1983) establishes the following proposi- 
tions for the social value of communication in 
such models: 


Proposition N. In the context of a model in which the
agent receives private information after selecting his or
her effort level, communication will be of no social value
if x is a sufficient statistic for $(x, y_{a2})$ with respect to e ($y_{a2}$
is the agent’s private information signal).


Proposition O. In the context of a model in which the
agent receives private information after selecting his or
her effort level, communication will be strictly valuable if
(i) $y_{a2}$ is a sufficient statistic for $(x, y_{a2})$ with
respect to e.
(ii) $E(x \mid y_{a2})$ is increasing in $y_{a2}$ (for a given e).
[$E(x \mid y_{a2})$ denotes conditional expectation.]
(iii) In the absence of information the optimum
effort level would be strictly positive and the
optimum reward function would be strictly
increasing in x.


From an empirical point of view both of these 
propositions run into the general problem raised 
above. In order to test these propositions one 
must be able to verify a sufficient statistic condi- 
tion. This condition must hold for all possible 
values of e including the sub-optimal values of e 
which are never selected in equilibrium. 


Testing Holmstrom’s theory of relative 
performance evaluation 

The basic agency model was extended by 
Holmstrom (1982) to incorporate multiple 
agents. The novel feature of the Holmstrom
(1982) model is that it allows the reward of one
agent to depend not only on his or her own per- 
formance but also on the performance of other 
agents. In particular when the exogenous uncer- 
tainties which influence the performances of 
several agents are mutually correlated then the 
observed performance of one agent provides in- 
formation relevant for assessing the perform- 
ance of others. Holmstrom provided a formal 
proof that relative performance evaluation can 
be socially valuable. Holmstrom (1982) estab- 
lished the following proposition. 


Let $x_i(e_i, \theta_i)$ denote the performance of agent i under effort
level $e_i$ if the state of the world is $\theta_i$ (here $\theta_i$ is a random
factor reflecting the state of the world in so far as it
affects the performance of agent i).
Then:


Proposition P. If the $x_i$s are monotonic in $\theta_i$, then the optimal
reward function of agent i will depend on $x_i$ alone if
and only if the $x_i$s are statistically independent.


Recent papers by Kunkel & Magee (1984)
and Antle & Smith (1986) present interesting
empirical tests of proposition P. This subsection
considers both of these papers in some detail as
they provide concrete illustrations of the need
for caution when interpreting the results of tests
of “if and only if” propositions such as proposition
P.

Kunkel & Magee (1984) attempted to test 
proposition P. To this end the authors selected a 
sample of companies and, for each company in 
their sample, estimated a time series regression 
of the following form: 


$PC_{it} = a_i + b_{1i}\, DROE_{it} + b_{2i}\, ADROE_{it} + u_{it}$


where $PC_{it}$ = percentage change in salary plus
bonus of the chief executive officer of firm i in
year t;

$DROE_{it}$ = change in the return on equity
employed of firm i in year t;

$ADROE_{it}$ = average change in return on equity in
year t of the firms in the same industry as firm i;

$u_{it}$ = a conventional OLS disturbance term;

and $a_i$, $b_{1i}$, $b_{2i}$ are parameters to be estimated.

According to the basic agency model, the
coefficient $b_{1i}$ should be positive for all firms. For
the 86 firms in their sample Kunkel & Magee reported
positive $b_{1i}$ estimates for 76 firms and 10
negative (but statistically insignificant) values.
These results are clearly consistent with the
basic agency model.
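
A firm-level regression of this form can be sketched as follows. The data are synthetic and noise-free, chosen only so that the recovered coefficients are easy to verify. The helper names are hypothetical and the normal-equations solver is a generic textbook routine, not the estimation procedure used by Kunkel & Magee.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y: Sequence[float], regressors: Sequence[Sequence[float]]) -> List[float]:
    """OLS with an intercept: returns [intercept, slope_1, slope_2, ...]."""
    rows = [[1.0] + list(r) for r in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic six-year series for one hypothetical firm.
droe = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]   # change in own return on equity
adroe = [0.01, 0.00, 0.02, -0.01, -0.01, 0.01]  # industry average change
# Pay responds positively to own performance and negatively to the industry's.
pc = [0.05 + 2.0 * d - 1.0 * a for d, a in zip(droe, adroe)]

a_hat, b1_hat, b2_hat = ols(pc, [droe, adroe])
print(round(a_hat, 4), round(b1_hat, 4), round(b2_hat, 4))
```

In this constructed case the own-performance coefficient is positive and the industry coefficient is negative, which is the sign pattern that relative performance evaluation would produce.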

With regard to relative performance evaluation,
proposition P implies that the coefficient $b_{2i}$
should be negative if, but only if, DROE and
ADROE are significantly positively correlated.
For their sample of 86 firms Kunkel & Magee reported
negative estimates of $b_{2i}$ for 48 firms.
However, in order to assess whether these
results are consistent with proposition P, it is
also necessary to know how many of the 38 firms
with positive $b_{2i}$ values exhibited negative correlation
between DROE and ADROE. Proposition P
would be supported if all the firms with positive
correlation between DROE and ADROE had
negative $b_{2i}$ values and if all the firms with negative
correlation between DROE and ADROE had
positive $b_{2i}$ values. Unfortunately Kunkel &
Magee only presented such detailed results for
one particular industry involving eleven firms.
All eleven of these firms exhibited significant
positive correlations between DROE and
ADROE and so, according to proposition P, all
eleven of these firms should have a negative $b_{2i}$
coefficient. In fact only two of the reported estimated
values of $b_{2i}$ were negative and statistically
significant whilst six values were negative but insignificant
and three values were positive but insignificant.
The interpretation of these results as
a test of agency theory is ambiguous. On the one
hand it could be argued, as did Kunkel & Magee,
that since none of the $b_{2i}$ values are significantly
positive whilst two values are significantly negative
the results are consistent with proposition
P. On the other hand one could point out that
proposition P is an “if and only if” proposition. In
particular the “only if” part of proposition P
states that the reward of the agent will depend
on $x_i$ alone only if the $x_i$s are statistically independent.
Hence from eleven firms exhibiting
significant positive correlation between DROE
and ADROE one would expect most of these to
exhibit statistically significant (negative) $b_{2i}$ values.
The fact that only two values were significant
might be interpreted, therefore, as evidence
against proposition P.

In an impressive and methodologically 
thoughtful paper Antle & Smith (1986) pro- 
vided an alternative test of proposition P. For 
each firm in their sample their test involved a 
comparison of two time series regressions. The 
first regression was expressed as follows: 

$MC_{st} = a_s + b_s\, ROA_{st} + c_s\, RET_{st} + e_{st}$   (A)

where $MC_{st}$ is the current income equivalent of
the chief executive of firm s in year t (inflation
adjusted);

$ROA_{st}$ is the accounting rate of return on assets
employed in firm s in year t;

$RET_{st}$ is the market return on the equity of firm s
in year t;

$e_{st}$ is a conventional disturbance term;

and $a_s$, $b_s$ and $c_s$ are parameters to be estimated.


Their second regression was expressed as fol- 
lows: 


$MC_{st} = a_s + b_{1s}\, SROA_{st} + b_{2s}\, UROA_{st} + c_{1s}\, SRET_{st} + c_{2s}\, URET_{st} + e_{st}$   (B)


where SROA and UROA represent the “systematic”
and “unsystematic” components of ROA respectively,
and SRET and URET represent the
“systematic” and “unsystematic” components of
RET.

The systematic component of ROA was estimated
by regressing ROA on a weighted average
of the rates of return of firms in the same industry.
The residuals from this regression were then
adopted as estimates of $UROA_{st}$ and $SROA_{st}$ was
set equal to $ROA_{st}$ minus $UROA_{st}$. An analogous
procedure was used to estimate $SRET_{st}$.
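
The decomposition just described amounts to splitting each observation into a fitted value and a residual from a one-regressor regression. A minimal sketch, using made-up return series and a hypothetical function name, is:

```python
from typing import List, Sequence, Tuple


def decompose(
    firm_returns: Sequence[float], industry_returns: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Split firm returns into systematic (fitted) and unsystematic (residual) parts.

    The firm series is regressed on the industry series by ordinary least
    squares; residuals are the unsystematic component and the systematic
    component is the firm return minus the residual.
    """
    n = len(firm_returns)
    mean_x = sum(industry_returns) / n
    mean_y = sum(firm_returns) / n
    sxx = sum((x - mean_x) ** 2 for x in industry_returns)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(industry_returns, firm_returns)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    unsystematic = [
        y - (intercept + slope * x) for x, y in zip(industry_returns, firm_returns)
    ]
    systematic = [y - u for y, u in zip(firm_returns, unsystematic)]
    return systematic, unsystematic


# Made-up annual return series, for illustration only.
roa = [0.08, 0.10, 0.07, 0.12, 0.09]
industry_roa = [0.07, 0.09, 0.08, 0.11, 0.10]

sroa, uroa = decompose(roa, industry_roa)
print([round(v, 4) for v in sroa])
print([round(v, 4) for v in uroa])
```

By construction the two components sum to the original series, and the unsystematic component is uncorrelated with the industry series within the sample.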

To test proposition P, Antle & Smith (1986) 
performed an F test designed to test whether 
model (B) yields a significantly better fit to the 
data than model (A). For their sample of 39 firms 
the goodness of fit was improved (at the 10% 
level) in 16 cases. 
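
The comparison of models (A) and (B) can be sketched with the standard F statistic for nested regressions. Model (A) is model (B) with two restrictions imposed (equal coefficients on the systematic and unsystematic components of ROA and of RET), so the statistic has 2 numerator degrees of freedom. The residual sums of squares and the number of annual observations below are hypothetical, and the exact computation used by Antle & Smith is not reproduced here.

```python
def nested_f_statistic(
    rss_restricted: float,
    rss_unrestricted: float,
    n_restrictions: int,
    n_obs: int,
    n_params_unrestricted: int,
) -> float:
    """F statistic for testing a restricted model against an unrestricted one."""
    if rss_unrestricted <= 0 or rss_restricted < rss_unrestricted:
        raise ValueError("expected rss_restricted >= rss_unrestricted > 0")
    df_denominator = n_obs - n_params_unrestricted
    if df_denominator <= 0:
        raise ValueError("not enough observations for the unrestricted model")
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator


# Hypothetical values for a single firm.
f_stat = nested_f_statistic(
    rss_restricted=10.0,      # model (A): 3 parameters
    rss_unrestricted=6.0,     # model (B): 5 parameters
    n_restrictions=2,
    n_obs=20,
    n_params_unrestricted=5,
)
print(f_stat)
```

The statistic would then be compared with the critical value of the F distribution with (2, n_obs - 5) degrees of freedom at the chosen significance level.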

Antle & Smith interpreted these results as
being consistent with Holmstrom’s theory of relative
performance evaluation. However, whilst
they are correct in drawing the conclusion that
some firms act as if they employ relative performance
evaluation, they have not, strictly
speaking, provided a critical test of proposition
P. To test proposition P one needs to test both
the “if” and the “only if” part of the proposition.
Proposition P predicts that relative performance
evaluation will be used whenever the performance
measures of agents are statistically dependent.
Hence, before a researcher can say
whether the finding that model (B) yields a better
fit than model (A) in 16 out of 39 cases is supportive
of proposition P, the researcher must
first estimate how many of these 39 cases ought
to have provided a better fit if the theory is correct.
Now table 2 of Antle & Smith indicates that
75% of their sample exhibited an R² of 44% or
more in the regression of ROA on the industry
weighted average ROA. This suggests that about
29 of the 39 firms should have exhibited a better
fit of model (B) over model (A). Looked at in this
light the finding that only 16 firms exhibited a
better fit might be interpreted as evidence
against proposition P. At best the interpretation
of their findings with respect to proposition P is
equivocal.
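
The arithmetic behind the figure of about 29 firms is reproduced below. It treats an R² of 44% or more as indicating statistical dependence between firm and industry performance, which is the working assumption of the argument above rather than a formal test.

```python
n_firms = 39                 # firms in the Antle & Smith sample
share_with_high_r2 = 0.75    # share with R-squared of 44% or more (their table 2)
observed_better_fit = 16     # firms for which model (B) improved the fit

expected_better_fit = share_with_high_r2 * n_firms
shortfall = expected_better_fit - observed_better_fit

print(f"expected: about {round(expected_better_fit)} firms")
print(f"observed: {observed_better_fit} firms")
print(f"shortfall: about {round(shortfall)} firms")
```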

A critical test of proposition P must show that 
firms use relative performance evaluation in 
situations where, according to the theory, they 
ought to use it, and that they do not use relative 
performance evaluation where, according to the 
theory, they ought not to use it. The two papers 
reviewed in this section provide convincing evi- 
dence that many firms use relative performance 
evaluation but they do not provide a critical test 
of proposition P. If anything, their evidence 
suggests that fewer firms use relative performance
evaluation than would be predicted by
proposition P.


Testing agency models with conditional 
post-decision information

If post-decision information is costly to ac- 
quire it may be sensible to make the decision of 
whether to acquire such information condi- 
tional on the realised payoff. 

This possibility was first analysed by Baiman & 
Demski (1980a) who extended the basic agency
model by assuming that a post-decision information
signal can be observed for an expenditure of
a fixed amount, K, and that the decision to acquire
this information can be taken after the
payoff has been observed. They established the 
following proposition: 


Proposition Q. Pareto optimal investigation strategies are 
pure. 


Here an investigation strategy means a 
mathematical function relating the probability 
of post-payoff investigation to the realised 
payoff. A pure investigation strategy is one for 
which the probability of investigation is either 
zero or one for every possible level of the 
realised output. A mixed investigation strategy is
one for which the probability of investigation is
greater than zero but less than one for some
realised output levels. If we let a(x) stand for the
optimal investigation strategy then proposition
Q implies that a(x) is zero or one for all values of
x.
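
The distinction between pure and mixed strategies can be stated compactly in code. In the sketch below an investigation strategy is represented as a mapping from realised output levels to investigation probabilities; the output levels and probabilities are hypothetical.

```python
from typing import Dict


def classify_strategy(investigation_prob: Dict[float, float], tol: float = 1e-12) -> str:
    """Return 'pure' if every probability is 0 or 1, otherwise 'mixed'."""
    for output, prob in investigation_prob.items():
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability for output {output} is outside [0, 1]")
    is_pure = all(
        prob <= tol or prob >= 1.0 - tol for prob in investigation_prob.values()
    )
    return "pure" if is_pure else "mixed"


# Investigate only when output is low: a pure, lower tailed strategy.
lower_tailed = {10.0: 1.0, 20.0: 1.0, 30.0: 0.0, 40.0: 0.0}
# Investigate low outputs with probability one half: a mixed strategy.
randomised = {10.0: 0.5, 20.0: 0.5, 30.0: 0.0, 40.0: 0.0}

print(classify_strategy(lower_tailed))
print(classify_strategy(randomised))
```

Under proposition Q, observing a strategy of the second kind would count as contrary evidence.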

The Baiman & Demski paper was based on all 
the assumptions of the basic agency model in- 
cluding the assumption that the agent does not 
observe any private information before selecting 
his or her effort level. Kanodia (1985) analysed 
a model similar to the Baiman & Demski model 
except for the fact that he assumed that the agent 
would also be able to observe private pre-deci- 
sion information. He showed that pure investiga- 
tion strategies may no longer be Pareto optimal 
once one allows the agent access to private pre- 
decision information. From a normative 
standpoint this is an interesting result because it 
informs the principal that it may be possible to 
achieve a Pareto improvement by considering 
mixed investigation strategies. From an empiri- 
cal point of view Kanodia’s findings rule out one 
possible test of agency theory. If Kanodia had 
also found that all Pareto optimal investigation 
strategies were pure this would have provided a 
falsifiable proposition since the observation of 
any mixed strategy would count as contrary evi- 
dence. The finding that investigation strategies 
can be pure or mixed means that information on 
this aspect of the investigation strategy provides
no evidence on the validity of the theory.

Baiman & Demski (1980b) examined issues
relating to the nature of the optimal investigation
region under assumptions which ensured
that the optimal investigation strategy would be
pure. The authors defined the optimal investigation
region as those values of x under which investigations
would occur. They also adopted a
number of simplifying assumptions in addition
to those underlying their 1980a paper. In particular
they assumed that the distributions of x
and $y_{p2}$ (the post-decision information signal)
conditional on e are statistically independent,
i.e.


$f(x, y_{p2}; e) = h(x; e)\, g(y_{p2}; e)$.   (r)


Under this additional assumption they were
able to show that the optimal investigation region
would be one tailed or two tailed depending
on the agent’s degree of risk aversion. In particular
they established the following proposition:


Proposition R. Let K be sufficiently small so that a non-trivial
investigation region is optimal in the reduced investigation
model. Then the optimal investigation region is
(i) lower tailed only if y < 1/2 or y > 1,
(ii) upper tailed only if 1/2 < y < 1,
(iii) independent of x if y = 1.


Here y is a measure of the agent’s risk toler- 
ance. A cross-section test of this proposition 
would be possible if several agency relation- 
ships, where condition (r) was known to hold at 
least approximately, could be observed.’ In ad- 
dition the researcher would need to be able to 
verify the shape’ of the investigation region and 
estimates of the agent’s risk tolerance paramet- 
ers would also be required. Given this informa- 
tion the null hypothesis that the agent’s risk to- 
lerance makes no difference to the shape of the 
investigation region could be tested against the 
alternative hypotheses that lower tailed investi- 
gation is more likely for estimated values of y 
less than half or greater than one. As a practical 
matter such a test may not be feasible since, as 
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noted by Baiman & Demski (1980b, p. 194): 


... we may not observe high-tail investigation policies simply because the people for whom
such policies are optimal would not be in positions in which they are the object of the
monitoring.


Baiman & Demski (1980b) derived an import- 
ant corollary to proposition R which offers the 
prospect of a relatively straightforward test of 
their theory: 


Proposition S. The optimal incentive functions in the basic agency model with conditional
investigation are such that:
(i) $I(x) > \int I(x,y)\, g(y \mid e)\, dy$ for all x if γ < 0 or γ > 1,
(ii) $I(x) < \int I(x,y)\, g(y \mid e)\, dy$ for all x if 0 < γ < 1,
(iii) $I(x) = \int I(x,y)\, g(y \mid e)\, dy$ for all x if γ = 0.


Here I(x) is the reward function if no investigation takes place, I(x,y) is the reward
function if an investigation takes place, and γ is the agent’s risk tolerance parameter. Each
of the inequalities in (i) to (iii) compares the reward under I(.) at a given level of x with
the average reward which the agent would receive under x if an investigation took place
whenever x occurred. For example, inequality (i) states that, for a given x, the average
reward if an investigation takes place whenever x occurs will be strictly less than the
reward under x if no investigation takes place.

From the perspective of empirical testability, the most interesting feature of proposition S
is that the set of values of the risk tolerance parameter γ for which inequality (i) holds is
a proper subset of the values of γ under which the optimum investigation region is lower
tailed. This implies that proposition S can be tested even in situations where only agency
relationships with lower tailed investigation are observed. In particular, given estimates of
γ for a cross-section of agency relationships with lower tailed investigation, one can test
the hypothesis that inequality (i) is more likely to hold for estimated values of γ less than
zero or greater than one, and inequality (ii) is more likely to hold for estimated values of γ
between zero and one half.


8 Lambert (1985) has shown that proposition R may no longer hold when condition (r) is violated. 


9 That is, whether the region was upper tailed, lower tailed, or independent of x.

There would seem to be definite scope here for
a test of the Baiman & Demski theory. 
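
The cross-section test just described can be sketched in code. The fragment below is only an illustration under stated assumptions: the record fields, the function names and the sample figures are hypothetical and are not taken from Baiman & Demski, and the estimate of the average reward under investigation is simply taken as given. The risk tolerance parameter is written gamma in the code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgencyRelationship:
    """One observed relationship with lower tailed investigation (hypothetical record)."""
    gamma: float               # estimated risk tolerance parameter
    reward_no_invest: float    # reward at the observed x when no investigation occurs
    mean_reward_invest: float  # estimated average reward at x if investigated


def predicted_inequality(gamma: float) -> Optional[str]:
    """Inequality of proposition S predicted for a given risk tolerance estimate."""
    if gamma < 0 or gamma > 1:
        return "i"
    if 0 < gamma < 1:
        return "ii"
    if gamma == 0:
        return "iii"
    return None  # gamma == 1 is not covered by the proposition as stated


def observed_inequality(r: AgencyRelationship, tol: float = 1e-9) -> str:
    """Classify which inequality the observed rewards actually satisfy."""
    diff = r.reward_no_invest - r.mean_reward_invest
    if diff > tol:
        return "i"
    if diff < -tol:
        return "ii"
    return "iii"


def cross_tabulate(sample):
    """Count (predicted, observed) pairs: the raw material for a contingency test."""
    return Counter(
        (predicted_inequality(r.gamma), observed_inequality(r)) for r in sample
    )


# Hypothetical sample; every gamma is below one half or above one,
# as required for lower tailed investigation to be observed at all.
sample = [
    AgencyRelationship(gamma=-0.5, reward_no_invest=100.0, mean_reward_invest=90.0),
    AgencyRelationship(gamma=0.3, reward_no_invest=80.0, mean_reward_invest=95.0),
    AgencyRelationship(gamma=1.5, reward_no_invest=120.0, mean_reward_invest=110.0),
    AgencyRelationship(gamma=0.4, reward_no_invest=70.0, mean_reward_invest=60.0),
]
table = cross_tabulate(sample)
```

Cells on the diagonal of the resulting table agree with the prediction; off-diagonal cells count against it. Any formal test of association would be applied to these counts.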


CONCLUDING REMARKS 


Agency theory has contributed a great deal to- 
wards our understanding of incentive structures 
and the use of information in contractual situa- 
tions involving uncertainty. If nothing else, it has 
contributed a valuable conceptual framework 
for thinking about such issues. This paper in no 
way detracts from these important conceptual 
achievements. Rather its aim has been to focus 
on the status of agency theory as a positive 
theory of contractual relationships under uncer- 
tainty. 

The previous sections have attempted to high- 
light the main difficulties involved in testing the 
empirical propositions of agency theory. Some 
of these difficulties stem from the inherent com- 
plexity of agency theoretic propositions relative 
to the propositions of conventional micro- 
economics. For example, it is difficult to express 
propositions relating to the shape of the reward 
function in the form of comparative static prop- 
ositions. Other difficulties arise because most of 
the information needed to test agency proposi- 
tions will not normally be observable by the re- 
searcher. This difficulty is particularly serious 
with respect to agency theory because one of 
the central assumptions of the theory is informa- 
tion asymmetry. In particular, substantial dif- 
ficulties arise from the fact that the researcher 
will normally only observe the rewards and per- 
formances arising from equilibrium effort levels 
whilst critical tests of many agency propositions 
require information relating to the probability 
density function of output for all possible levels 
of effort including suboptimal effort levels. Fi- 
nally, ambiguities arise in the interpretation of 
the results of tests of agency propositions, especially
“if and only if” propositions such as proposition P.




With regard to the basic agency model, we 
have argued that the most promising direction
for empirical research lies in attempts to develop
and test the comparative static predictions
of the model. In particular further work is 
needed to develop testable comparative static 
propositions for (empirically) popular non- 
linear reward functions in addition to those al- 
ready developed for linear reward functions. De- 
tailed case studies of the reward structures of in- 
dividual executives and managers are also 
needed to complement the large scale statistical 
studies which, of necessity, are based on strong 
(sample) homogeneity assumptions. 

With regard to agency models with pre-deci- 
sion information, we have argued that, because 
of observability problems, such models are prac- 
tically devoid of empirical content. Penno 
(1984) has developed one, potentially testable, 
proposition but this proposition is severely re- 
stricted in its domain of application. The de- 
velopment of more generally applicable empiri- 
cally testable propositions awaits the develop-” 
ment of empirically verifiable conditions which 
are either necessary for a pre-decision informa- 
tion system to have positive value or sufficient 
for a pre-decision information to have negative 
value. ; 

With regard to agency models with post-deci- 
sion information, we have argued that the scope 
for empirical testing is likely to prove extremely 
limited due to the difficulty of verifying suffi- 
cient statistic conditions such as conditions (j), 
(m) and O(i). Self-selection difficulties may also 
render some propositions, such as proposition 
R, practically untestable. The most promising di- 
rection for immediate empirical research would 
appear to be further tests along the lines of Antle 
& Smith and Kunkel & Magee which take into ac- 
count the criticisms discussed above. Proposi- 
tions R and S may provide the basis for tests of 
agency models with conditional post-decision 
information.


Abstract


Positive accounting research consists of tests of positive accounting theory; it is a substantial body of
important accounting literature. The discussion in this paper is aimed at demonstrating that a problem in
experimental procedure exists with such research that is created when positive accounting researchers
translate the theoretical language of positive accounting theory into an empirical language that permits
experimentation. Experimentally, positive accounting theory is transformed into a tautology.
Experimenters define a causal variable in terms of the phenomenon they wish to explain with it. This
transforms accounting practice into its own explanation, which is not quite the improvement in our
understanding envisioned by positive accounting theory. In addition, the paper includes the demonstration
that any plausible argument available to defend positive accounting research against the accusation that it
creates a tautology requires that damage be done to the theory that informs such research.


Positive accounting research (PAR) has pro- 
duced a sizeable body of literature (see Table 1 
for examples) consisting of reports on experi- 
ments investigating the accounting procedures 
choice problem. The researchers performing 
these experiments have been informed by what 
is called positive accounting theory (PAT); spec- 
ifically, the application of Jensen & Meckling’s 
(1976) agency theory to explain managers’ (and 
others) choices of accounting procedures. 
These choices may be either actual choices of 
procedure or preferences for procedures re- 
vealed through lobbying of standard setting 
bodies. 

The purpose of any accounting theory, ac- 
cording to PAT’s more prominent proponents 
(Watts & Zimmerman, 1986, p. 2), “... is to exp- 
lain and predict [emphasis in original] account- 
ing practice.” Positive accounting theory is an at- 
tempt to provide some explanation of account- 
ing practice rooted in the purposes of managers. 
Simply stated, these purposes of managers, as initially described by Watts & Zimmerman
(1978), are reducible to economic self-interest. That is, when choosing accounting procedures
managers consider only the effect of the procedures on their wealth; choices are
wealth-maximizing given the constraints imposed by other wealth-maximizing agents (e.g.,
shareholders, bondholders). Further, it is presumed that
managers maximize their wealth if they choose 
those accounting procedures that maximize the 
value of the firm and/or maximize manager com- 
pensation via compensation agreements tied to 
accounting numbers. The list of wealth-affecting 
costs is familiar, e.g., political costs, bookkeeping 
costs, taxes, contracting costs. 

The type of study with which this paper is concerned is identified as positive accounting
research, since all studies of this type involve testing positive accounting theory. The purpose of
this paper is to demonstrate that these PAR 
studies all suffer from a logical flaw in their de- 
sign which makes ascribing unambiguous mean- 
ing to the measures used to test the theory very 
problematic. The logical problem is created 


*Expressions of gratitude are due to Ed Arrington, Jon Bartley, Jere Francis, Katherine Frazier, participants at the accounting
workshops at UNC-CH and Florida State University and two anonymous reviewers. The author also wishes to thank the
Department of Economics and Business of North Carolina State University for providing the financial support for this project.

when the theoretical language of positive accounting theory is translated into the empirical
language of PAR. The result of the translation 
process is to define self-interest in terms of ac- 
counting practice, the phenomenon self-interest 
is to explain. Experimentally PAT is made 
tautologous. Further, it will be demonstrated 
that avoiding that tautology requires that one of 
the two fundamental propositions of PAT be 
either false or undecidable. 


POSITIVE ACCOUNTING THEORY 


The demonstration that PAR has a problem of 
incoherency must begin with the development 
of the logic of PAT in an acceptable form. If the 
logical form of PAT is to be acceptable, it should
be constructed from the blueprint explicitly given by the individuals who are currently
constructing and testing it; this is the procedure that
will be followed. 

Watts & Zimmerman, as previously noted (1986, p. 2), claim that positive accounting
theory is to provide a scientific explanation of
accounting practice. They contend that their 
view of theory is the view of theory in science; to 
quote them (1986, p.2), “The preceding view of 
theory, explicitly or implicitly, underlies most 
empirical studies in economics. It is also the 
view of theory in science (e.g. Poincaré, 1905; 
Popper, 1959; Hempel, 1965).” Whether this 
view is appropriate to accounting or the social
sciences in general is questionable; it is perhaps presumptuous to impose a set of
generally accepted scientific standards on accounting.

The discussion in this paper is not to be taken 
as an endorsement of this view of theory; it is 
concerned only with demonstrating that a logi- 
cal incoherence exists within those experiments 
designed as tests of some PAT propositions. The 
status of PAT qua theory is not at issue. The 
theory itself has been criticized, e.g., Tinker et al.
(1982), Lowe et al. (1983), Schreuder (1985);
the theorists have been criticized, e.g., Christen- 
son (1983); and the epistemology forming the 
basis of the theory has been criticized, e.g., by Ar- 
rington (1986) in accounting, McCloskey 
(1985) in economics, and Rorty (1979) in phil- 
osophy. It is apparent that in spite of such exten- 
sive criticisms, PAT still has the status of a seri- 
ous theory for a number of accounting resear- 
chers. But if that status is granted, the evidence 
thus far offered by positive researchers is flawed 
evidence. The focus of this paper is on explain- 
ing why such experimental evidence is of doubt- 
ful value, thus the view of theory held by 
positivist accounting researchers will be main- 
tained throughout. 

PAT is an explanation of accounting practice.
According to Watts & Zimmerman (1986, p. 2), 
“(O)ur definition of accounting practice is 
broad.” How broad is not explicitly stated but 
most certainly broad enough to include the ac- 
counting procedures that may be observed to be 
in effect at some particular time and place. 


TABLE 1. Examples of PAR studies


Authors                       Dependent variables                     Independent variables
Watts & Zimmerman (1978)      1. Support or not support GPLA          1. Depreciation expense ÷ mkt. value*
                                                                      2. Net monetary asset position ÷ mkt. value*
                                                                      3. Sales × effect of GPLA on income*
                                                                      4. Sales ÷ total sales in SIC group × GPLA effect*
                                                                      5. Compensation plan: yes or no
                                                                      6. Regulated: yes or no
Hagerman & Zmijewski (1979)   1. Inventory method                     1. Total assets*
                              2. Depreciation method                  2. Sales*
                              3. Pension method                       3. Beta
                              4. ITC method                           4. Fixed assets ÷ sales*
                                                                      5. Concentration ratio*
                                                                      6. Compensation plan: yes or no


Dhaliwal (1980)               1. Oil and gas development cost method  1. Sales (control variable)*
                                                                      2. Debt/equity*
Salamon & Dhaliwal (1980)     1. Disclosure of segmental data         1. Total assets*
                                                                      2. New capital issues: yes or no
Bowen et al. (1981)           1. Capitalization of interest           1. Compensation plan: yes or no
                                                                      2. Dividends ÷ unrestricted R/E*
                                                                      3. Current interest ÷ interest expense*
                                                                      4. Net tangible assets ÷ long term
                                                                      5. Sales
                                                                      6. Unrestricted R/E: yes or no*
Holthausen (1981)                                                     1. Forecast error EPS ÷ stock price*
                                                                      2. Compensation plan: yes or no
                                                                      3. Impact of dep. change on EPS ÷ stock price*
                                                                      4. Book value of public debt ÷
                                                                         (stock price × shares)*
                                                                      5. Book value of private debt ÷
                                                                         (stock price × shares)*
                                                                      6. Inventory of payable funds ÷
                                                                         (stock price × shares)*
                                                                      7. Book value of public debt + book value of
                                                                         private debt ÷ (stock price × shares)*
Leftwich et al. (1981)        1. Frequency of interim reporting       1. Net property ÷ firm value*
                                                                      2. Mkt. value of stock + book value of current
                                                                         liabilities, long term debt and preferred stock*
                                                                      3. Book value of bank loans, public and private
                                                                         debt ÷ firm value*
                                                                      4. Book value of preferred equity ÷ firm value*
                                                                      5. Outside director
                                                                      6. Frequency of reporting in 1937
                                                                      7. Stock exchange listing
Zmijewski & Hagerman (1981)   1. Income strategy                      1. Compensation plan: yes or no
                                                                      2. Concentration ratio*
                                                                      3. Beta
                                                                      4. Log of net sales*
                                                                      5. Gross fixed assets ÷ sales*
                                                                      6. Total debt ÷ total assets*
Chow (1982)                   1. Hire external auditing               1. Mkt. value of owners’ equity + book value
                                                                         of debt*
                                                                      2. Debt ÷ (measure defined in 1 above)*
                                                                      3. Number of accounting measures used in
                                                                         debt covenants
                                                                      4. % management ownership
                                                                      5. Stock registration: NYSE or OTC
Dhaliwal et al. (1982)        1. Depreciation method                  1. Owner controlled: yes or no
                                                                      2. Total assets*
                                                                      3. Debt ÷ equity*
Kelly (1982)                  1. Reaction to SFAS #8                  1. Compensation plan: yes or no
                                                                      2. Debt ÷ equity*
                                                                      3. Total consolidated assets*
                                                                      4. % management ownership
                                                                      5. Foreign sales ÷ total consolidated sales*
                                                                      6. Remuneration percentage*
Kelly (1985)                  1. Lobby on SFAS #8                     1. Foreign sales ÷ total consolidated sales*
                                                                      2. Total debt ÷ total assets*
                                                                      3. % management ownership
                                                                      4. Total assets*
Wong (1988)                   1. Disclosure of current cost           1. Income tax expense (net of deferred tax) ÷
                                 financial statements                    net income before taxes*
                                                                      2. Long-term liabilities ÷ (total assets −
                                                                         current liabilities)*
                                                                      3. Market concentration ratios measured
                                                                         using net income*
                                                                      4. Net income before interest and taxes ÷
                                                                         total assets*
                                                                      5. Gross fixed assets ÷ total assets*
                                                                      6. Net income after taxes before
                                                                         extraordinary items*

*Denotes an accounting measure.


Most of the studies listed in Table 1 are concerned with accounting procedures; Watts &
Zimmerman (1978, p. 112) state that, “Ultimately, we seek to develop a positive theory of
the determination of accounting standards.” Of course, accounting standards are accounting
procedures, i.e., the rules that are applied for integrating transaction events into the
financial statements. It seems to be the case with positive theorists that the accounting
practice to be explained by a positive theory must certainly include the procedures
accountants employ to calculate such things as total assets, total liabilities, owners’
equity, net income, etc. If accounting procedures are not part of the phenomenal domain of
accounting practice then most of what has been written under the rubrics of positive theory
and positive research would make little sense.

As Christenson (1983, p.17) notes, the emphasis
in PAT is on explanation. To explain an
event, in the view of science endorsed by posi- 
tive accountants, is to come to accept the event’s 
causes. The importance of causality in an expla- 
nation is noted by Watts & Zimmerman (1986, 
p.4): 





The public accountant or corporate manager may ob- 
serve an association between variables such as changes in 
procedures and changes in stock prices, but cannot tell 
whether the association is causal. To make the causality
interpretation, the practitioner requires a theory that 
explains the relation between the variables. The theory 
enables the practitioner to attach causality [emphasis 
added] to a particular variable, such as a procedure 
change. 


Clearly, in Watts & Zimmerman’s view it is the 
role of theory to permit the attribution of causality.¹

If PAT is to be explanatory in the sense en- 
visioned by positive theorists, it must contain at 
least one premise or proposition that permits 
causal attribution. What the theoretical proposi- 
tions are for positive theory can be inferred from 
the positive literature; there are two that are apparent.

One proposition pertains to the motives of
managers; it is the self-interest proposition. In 
their seminal paper, Watts & Zimmerman (1978, 
p.113) state “...we assume that individuals act to 
maximize their own utility.” Utility provides 
management with the motive for preference on 
accounting procedures, i.e., “(T)he obvious im- 
plication of this assumption is that management 


¹It should be emphasized here that positive theorists have been perhaps rather facile in their treatment of causality. One of
the reviewers of this paper noted that the inadequacy of PAT’s notion of causation is still one more difficulty not overcome.
Though this issue may be somewhat germane to the argument in this paper, particularly with respect to the problem of the
“similarity of events”, incorporating it would mean venturing too far afield. For a brief elucidation (with bibliography) of the
philosophical problems of causation see Taylor (1972).

lobbies on accounting standards based on its
own self-interest” (p.113). Watts & Zimmerman 
equate self-interest or utility with economic in- 
terests: “We assume that management’s utility is 
a positive function of the expected compensa- 
tion in future periods (or wealth) and a negative 
function of the dispersion of future compensa- 
tion (or wealth). The question is how do ac- 
counting standards affect management’s 
wealth?” (p.114). We can thus state positive ac- 
counting theory’s first theoretical proposition 
as: Managers’ economic self-interests determine 
their preferences for accounting procedures.²
The managerial self-interest or motive propos- 
ition is by itself insufficient to provide the causal 
explanation of accounting practice that positive 
theorists seek, since management preference is 
not accounting practice. What positive accounting theorists have done to provide for an
explanation of accounting practice is to attribute to management a causal role.³ Watts &
Zimmerman (1978) state quite explicitly a belief in the
causal efficacy of management: “Management, 
we believe, plays a central role in the determina- 
tion [emphasis added] of standards” (p.113). 
This attribution of causality has been reiterated 
by subsequent researchers putatively testing the 
theory. For example, Holthausen (1981, p. 73) 
described a motivation for his study as follows:
“Recently, however, researchers have begun to
examine management’s incentives to influence
the menu [emphasis added] of accepted ac- 
counting techniques and to choose among avail- 
able alternatives.” In a similar vein, Zmijewski & 
Hagerman (1981, p. 129) imply a causal role for 
management, i.e.: “These actions (management’s
lobbying activity) are designed to influence the
set [emphasis added] of generally accepted accounting
principles (GAAP) from which a firm
may choose.” Later in the same article these two 
authors allude to both propositions (pp. 129— 
130): 


The question is, assuming economic rationality, what are 
the benefits justifying these costs (management's lobby- 
ing and changing accounting principles)? In response to 
this question a positive theory of the determination [em- 
phasis added] and choice of accounting principles is 
being developed. 


Obviously, if managers are rational, they would 
not incur costs to influence adoption of accounting procedures if there were no prospect
they could succeed. If managers are not in some, even limited, sense causal of accounting
practice, attempting to understand their behavior makes little sense for an explanatory
theory of accounting
practice. We can state succinctly PAT’s second 
theoretical proposition as: Managers’ preferences for accounting procedures affect the
observable set of existing accounting procedures.⁴

Having identified the two central propositions 
of PAT, it is now possible to construct a critique 
of the experiments of those researchers who 
purport to be testing the theory. The behavior of 
such positive researchers can be shown to imply 
that the two fundamental propositions of PAT 
cannot be believed by them to be true simul- 
taneously. For economy of exposition, the de- 
monstration of the logical incoherence of PAR in 
the next section will be developed by focusing 
on only one representative PAR study listed in 
Table 1; the implications extend to all studies 
listed there. 


²All tests of positive accounting theory have focused exclusively on wealth variables. If other “utility”-yielding variables are
deterministic of managers’ choices of procedures they have yet to be incorporated into positive theory.


³For an explanatory theory of accounting practice to have almost singularly focused on the choices of management must
imply that managers’ choices of accounting procedure must be viewed as important determinants of the set of accounting
procedures we are able to observe.


⁴It is admittedly the case that many “groups,” e.g., regulators and CPAs, participate in the process of creating accounting
procedures. The theoretical proposition is not meant to imply that management is strictly causal, for such is probably not
factual. It does imply that knowledge of management preferences provides knowledge about what permissible accounting
procedures one is likely to observe, i.e., management is a causal agent, even if only one among many.


THE LOGICAL PROBLEM WITH PAR 


The logical problem with PAR will be 
explained through the use of a syllogism created 
from the theoretical propositions of PAT derived 
in the previous section. These two propositions 
are the self-interest proposition (henceforth 
labeled TP1): 


Managers’ economic self-interests determine their pre- 
ferences for accounting procedures; 


and the causal agent proposition (henceforth 
labeled TP2):


Managers’ preferences for accounting procedures affect
the set of existing accounting procedures. 


These two propositions are conjuncts and can 
be combined to form the compound statement 
of PAT (henceforth labeled TP): 


Managers’ economic self-interests determine their preferences for accounting procedures
and managers’ preferences for accounting procedures affect the set of existing accounting
procedures.


This statement is a truth-functional compound 
statement (see, e.g., Copi, 1979). If both compo- 
nents (TP1 and TP2) are “true”, then positive 
theorists may have the explanatory theory of accounting practice (defined as accounting
procedures) that they seek. But it is also the case that if either TP1 or TP2 is not
“true”, then TP is also not “true”.⁵
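
Because TP is a conjunction, its truth value can be tabulated mechanically. The following Python fragment is a minimal sketch of that truth table; the function name tp is illustrative only.

```python
from itertools import product


def tp(tp1: bool, tp2: bool) -> bool:
    """TP is the conjunction of TP1 and TP2."""
    return tp1 and tp2


# Enumerate all four combinations of truth values for TP1 and TP2.
truth_table = [(a, b, tp(a, b)) for a, b in product([True, False], repeat=2)]

for a, b, value in truth_table:
    print(f"TP1={a!s:5}  TP2={b!s:5}  TP={value}")
```

Only the row in which both conjuncts hold yields a true compound statement, which is why denying either TP1 or TP2 is sufficient to deny TP.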

The next step in constructing the syllogism is 
to combine the theoretical proposition TP with 
a factual statement (henceforth labeled FS): The set of existing accounting procedures
determines existing accounting measures.⁶ This statement is called factual since if it is
not considered
factual the specification of the phenomenon PAT 
is constructed to explain is substantially altered; 
the “facts” PAT deals with disappear. If managers 
are rational and if the selection of accounting procedures does not determine the resulting
accounting measures of management’s performance, there would be no accounting procedures
choice problem to explain. It is because diffe- 
rent accounting procedures produce different 
accounting measures that managers expend 
energy trying to create and choose among pro- 
cedures. 

Combining TP with FS permits deducing a 
theoretical conclusion (labeled TC), which 
completes the syllogism. The complete set of 
logically related statements, with conclusion, is 
as follows: 


TP: Managers’ economic self-interests determine their 
preferences for accounting procedures and managers’ 
preferences for accounting procedures affect the set of 
existing accounting procedures. 

FS: The set of existing accounting procedures determines
existing accounting measures.⁷

TC: Managers’ economic self-interests affect existing ac- 
counting measures. 




⁵Since the term “true” is used in reference to propositions that are purportedly testable by experimental means, for these
propositions to be “true” to positive researchers the ideas expressed by them must correspond to the facts. The activity of
positive researchers is directed toward establishing whether such correspondence exists, i.e., are the implications for belief
of the PAT propositions publicly acceptable. The discussion in this paper is concerned not with truth conditions but with
whether the “facts” being offered as evidence by positive researchers can be reasonably accepted as such.


⁶The term “accounting measure” is understood in the sense in which accountants use it. The number appearing after
“Accounts Receivable” on a balance sheet is a “measure” of the asset, accounts receivable. The “bottom line” of an income
statement is an accounting “measure”. Financial ratios are accounting “measures”.


⁷Some may take issue with the use of the verb “determines” in the factual statement. Do not other “things” also “determine”
accounting measures? The answer to that may be yes or no. Given a set of accounting procedures, other things determine
accounting measures, but given a distinct set of other “things”, accounting procedures determine accounting measures. The
only thing required for the argument being developed in this paper is to grant what positive researchers must grant if they
are to continue to have a problem to research: accounting procedures determine accounting measures to an extent greater
than “trivially”, i.e., the effect is one that cannot be ignored.

The theoretical conclusion is a logically true
statement if both TP and FS are true.⁸
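
The chain from TP1 through TP2 and FS to the conclusion TC can be illustrated as a composition of three functions. The Python sketch below is a toy illustration only: the mappings, the depreciation example and all figures are hypothetical assumptions made for exposition and are not drawn from any PAR study.

```python
def f(self_interest: str) -> str:
    """TP1: economic self-interest -> preferred procedure (hypothetical mapping)."""
    preferences = {
        "lower_income": "accelerated",
        "higher_income": "straight_line",
    }
    return preferences[self_interest]


def g(preference: str) -> str:
    """TP2: preference -> procedure in effect (assumed here to be adopted as preferred)."""
    return preference


def h(procedure: str) -> float:
    """FS: procedure in effect -> accounting measure (first-year income, toy figures)."""
    income_before_depreciation = 800.0
    cost, life = 1000.0, 4
    if procedure == "straight_line":
        depreciation = cost / life
    elif procedure == "accelerated":
        depreciation = 2 * cost / life  # double-declining balance, first year
    else:
        raise ValueError(f"unknown procedure: {procedure}")
    return income_before_depreciation - depreciation


def tc(self_interest: str) -> float:
    """TC: accounting measure as a function of economic self-interest."""
    return h(g(f(self_interest)))
```

With these toy figures, tc("lower_income") gives 300.0 and tc("higher_income") gives 550.0: the accounting measure varies with self-interest only because each link in the chain holds.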

Equipped with the set of logically related PAT 
propositions, it is now possible to demonstrate 
that positive accounting researchers have intro- 
duced an incoherency to their experiments by 
making it logically impossible for the compo- 
nents of the theoretical proposition (TP1 and 
TP2) to both be true simultaneously while pre- 
serving unambiguous meanings for the measure- 
ments made to test PAT. To give meaningful in- 
terpretation to the results of PAR experiments 
requires denying one of the component propos- 
itions. The logical problem arises when positive 
researchers translate the language of the theory 
into a language that permits empirical testing. 
How this translation process occurs and the log- 
ical result it produces will be illustrated using 
one paper representative of those informed by 
PAT, that by Hagerman & Zmijewski (1979) 
(henceforth HZ).⁹

In the introduction to their paper (p. 142), HZ 
indicate the paper’s purpose: “(T)his paper is designed to determine if economic motives
are determinants in the choice of alternative accounting principles.” In effect, the
experiment reported in the paper was a test of the truth value
of TP1, as are all of the other studies listed in 
Table 1. Because their study was informed by 
PAT, management’s choices of accounting pro- 
cedures were the dependent variables. 

For HZ or any other positive researchers to 
test whether “... economic motives are deter- 
minants in the choice of alternative accounting 
principles,” they must alter the language of TP so 
that it has empirical meaning. Since managers’ 
economic self-interests are not observable, HZ
define certain economic variables they believe
can be defined operationally and substituted for 
managers’ economic self-interests. These econ- 
omic variables are linked to managers’ economic 
self-interests by the logical arguments provided 
by PAT. 

In their study, HZ identify five such economic 
variables: size, risk, capital intensity, competi- 
tion, and incentive plans. They link these vari- 
ables to managerial self-interest via the standard 
arguments. For example, size is associated with 
the political costs firms may incur when sub- 
jected to political attack from groups lobbying 
to transfer wealth from firms, and thus from man- 
agers, to themselves. HZ argue (pp. 142—143): 


Firms have a variety of tactics they can employ to reduce 
these costs... the management may reduce reported net 
income in order to avoid drawing the lobbyists’ attention
to themselves. The reasonableness of this tactic is clear 
when one remembers how many new (sic) broadcasters 
have reported spectacular earnings increases of large 
firms. Thus large firms will have an incentive to choose 
accounting standards which reduce net income in order 
to avoid publicity and opprobrium. 


Similar arguments are made to link the other variables
to economic interests. They are:


1. Risk: “... riskier firms will appear to make excessive
profits and thus be subject to negative wealth transfers... 
so we expect higher risk firms to choose income deflat- 
ing alternatives” (p. 142); 

2. Capital intensity: “Thus we hypothesize that firms
that are relatively capital intensive and subject to political
costs will have an incentive to reduce reported income
by selecting the appropriate accounting principles”
(p. 143);

3. Competition: “Therefore, we assume that the more
concentrated the industry is, the greater is the likelihood
of either anti-trust or entry, hence the greater the incentive
to choose accounting methods that reduce reported
profits” (p. 145); and

4. Incentive plans: “If management incentive schemes
are related to accounting earnings we expect that management
has an incentive to use accounting principles
that increase accounting earnings if part of their income
is derived from incentive plans” (p. 145).


⁸ The derivation may easily be demonstrated if the propositions are put in functional form. Functionally,
TP1 is: Management preferences = f (Economic self-interests),
TP2 is: Existing accounting procedures = g (Management preferences),
and
FS is: Existing accounting measures = h (Existing accounting procedures).
By substitution:
Existing accounting measures = h (g(f(.))).

⁹ This paper was selected for two reasons. The first is that it provides an example of the logical problem with PAR with which
other PAR studies can be equated. The second is that it is an important paper. According to Brown & Gardner's (1985)
citation analysis, it is the fifth most influential paper in accounting published since 1979.


Once positive researchers like HZ have iden- 
tified economic variables with economic self-in- 
terest, they substitute these variables into the 
theoretical proposition of PAT. The proposition 
(labeled TP-I) now reads:


TP-I: Economic variables determine managers’ preferences
for accounting procedures and managers’ preferences
for accounting procedures affect the set of existing
accounting procedures.


“Economic variables” stand in for the unobservable
economic self-interests of managers.

After economic variables have been enumerated,
it is necessary to identify proxies for them
to define them operationally.¹⁰ Without operational
definitions for economic variables, empirical
testing of PAT is not possible. Positive researchers
have tended to rely very heavily on
existing accounting measures created from the 
financial statements of firms as their proxies for 
economic variables. For the studies listed in 
Table 1, over 70% of all proxies used were 
created from reported accounting numbers. 
Continuing with the HZ example, these researchers
followed the practice of using existing accounting
measures for their proxies, i.e.: Size =
Total assets or Sales (accounting measures); Risk
= Beta; Capital Intensity = Fixed assets ÷ Sales
(accounting measures); Competition = Concentration
ratio (accounting measures); and Incentive
Plans = Compensation plan exists.

When these existing accounting measure proxies
are substituted for economic variables into
TP-I, the original theoretical proposition of PAT
is further altered and becomes (labeled TP-II):


TP-II: Existing accounting measures determine managers’
preferences for accounting procedures and managers’
preferences for accounting procedures affect the
set of existing accounting procedures.


“Existing accounting measures” now stand in for
economic self-interests and serve as measures of 
preference. 

The theoretical proposition, thus altered, has 
been given empirical meaning, i.e., managers’ 
self-interests are defined in terms of observables 
that are quantified. The altered proposition is in- 
deed the one tested by positive researchers. 
Every study listed in Table 1 was a test of TP-II
and every one used existing accounting mea- 
sures as the vast majority of independent vari- 
ables to predict managerial preferences for ac- 
counting practices, e.g., proposed standards, ac- 
counting procedures, external auditing. 

The problem that positive accounting researchers
create for the interpretation of their experimental
results is revealed when TP-II is substituted
for the original theoretical proposition,
TP, to yield an empirical (observable) conclusion:


TP-II: Existing accounting measures determine managers’
preferences for accounting procedures and managers’
preferences for accounting procedures affect the
set of existing accounting procedures.

FS: The set of existing accounting procedures determines
existing accounting measures.

Empirical Conclusion (EC): Existing accounting mea- 
sures affect existing accounting measures. 
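
In the functional notation of the earlier footnote, the substitution that yields EC can be sketched as follows. The abbreviations AM (existing accounting measures), AP (existing accounting procedures), MP (management preferences), and ESI (economic self-interests) are shorthand introduced here for illustration only; f, g, and h are the functions named in that footnote, and the proxy step is written loosely as an approximation.

```latex
\begin{align*}
\text{TP1:}\quad & MP = f(ESI) \\
\text{TP2:}\quad & AP = g(MP) \\
\text{FS:}\quad  & AM = h(AP) \\
\text{Proxy step (TP-II):}\quad & ESI \approx AM \\
\text{EC:}\quad & AM = h\bigl(g\bigl(f(AM)\bigr)\bigr)
\end{align*}
```

Once accounting measures stand on both sides of the last line, the chain no longer connects accounting practice to anything outside itself.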


This empirical conclusion appears to be a
tautologous statement. That accounting measures
affect other accounting measures is a recognized
fact. Do positive researchers still possess
a theory whose propositions enable the deduction
of a fact about accounting practice or
have they merely made the theory, empirically,
tautologous by defining managers’ economic
self-interests in terms of accounting practice? If
“existing accounting measures” as the subject of
EC is reliably interpretable as “measures of economic
self-interest” then the former may likely be
the case. But if the ability to sustain confidence
in that interpretation of “existing accounting
measures” is compromised then the latter may
be the more likely case. In the following sections
an attempt is made to demonstrate from the behavior
of positive researchers that the latter is indeed
the case.

¹⁰ The issue of selecting better or worse proxies is a problem in its own right but is not germane to the issue at hand. This paper
is concerned with the conditions under which a possible proxy can even be considered a reasonable candidate.


SUSTAINING THE MEANING OF “EXISTING 
ACCOUNTING MEASURES” 


Positive researchers may be excused from 
having pursued the expedient path of making 
PAT true, and thus guaranteeing some significant 
correlations, by defining managerial self-interest 
in terms of accounting practice if their research 
can be interpreted as valid attempts to establish 
the truth value of TP1 and TP2. As previously
noted, for PAT to be an explanatory
theory of accounting practice, its two fundamental
propositions must be believable. They must
be “true.”

All of the studies listed in Table 1 are tests of 
TP1. The studies are reports of experiments 
each designed to test the hypothesis of whether 
the economic self-interests of managers deter- 
mine their preferences for accounting practices. 
These studies have yielded rather mixed
results, but results supportive enough of TP1 that
it is still maintained as a hypothesis. PAT has not
yet been altered to suggest some other explana- 
tion for managers’ preferences for accounting 
practices. The question is whether these at- 
tempts to test TP1 provide valid results within 
the context of PAT. As noted in the previous sec- 
tion, the validity of these results depends upon 
whether existing accounting measures are reli- 
able indicators of managements’ economic self- 
interests; i.e., do they connote preferences? The
reliability of these measures can be interpreted 
within two contexts: TP2 is “true” or it is “false”. 


If TP2 is true 

For the purpose of the argument to follow,
TP2 is true if knowledge of the preferences of
managers for accounting procedures alters the
probability one would assign to observing a particular
procedure being among the set allowed
in practice. If TP2 is true, two interpretations of 
PAR results are possible depending on interpre- 
tations of managers’ behavior in pursuit of their 
economic self-interests. 

One interpretation of self-seeking behavior is 
that it implies management is manipulative. That 
means that managers’ preferences for accounting
procedures are for those that allow them to tell
stories economically beneficial to them. If TP2 is 
true with such manipulative managements, then 
how do positive researchers attribute unam- 
biguous economic meaning to existing account- 
ing measures? There is a reflexivity; managers 
are manipulating those variables to represent 
the “economic” facts in a manner beneficial to 
themselves.¹¹ How can positive researchers,
with the reliability required of a “scientific” mea-
surement, interpret those variables as contain- 
ing information of economic substance? Ac- 
counting measures in such circumstances can 
convey little more than information about mana- 
gers’ past preferences for accounting proce- 
dures. This point can be illustrated with a simple 
example from positive theory. Sales and Total 
Assets are accounting numbers used to measure 
political costs. Positive theory predicts that, 
ceteris paribus, firms that are “larger”, measured 
by Sales or Total Assets, will have higher poten- 
tial political costs and their managements will, 
therefore, attempt to create and choose income- 
reducing accounting procedures. This strategy 
is deemed rational because of the assumed im- 
portance of profits (namely “excess profits”) as 
an event prompting the imposition of increased 
political costs. But how can managers affect income-reducing
procedures without affecting
either Sales or Total Assets? By the arithmetic of
the income equation, managers can reduce income
by decelerating the recognition of Sales revenue.
Or, they can accelerate the recognition of
expenses, which in turn decelerates the growth
of Total Assets. Since the process of managerial
creation and selection of accounting procedures
is ongoing, if managers are indeed effective manipulators
of accounting measures there is no reliable
way to know what such public accounting
measures mean. Do they say something about
the relative political costs of firms or simply the
relative effectiveness of managers as manipulators
of accounting numbers? There is no
justifiable criterion offered for deciding.

¹¹ One might argue there are no economic “facts” conveyed by accounting numbers. The response to this is that if there is no
extant idea of communicating reliable economic data, then manipulating accounting measures makes no sense. For example,
the notion of income smoothing is irrational unless one believes something of economic significance is being communicated
other than that management is acting to smooth income.

The original explicators of PAT, Watts & Zimmerman,
subscribe to the manipulative characterization
of management behavior. In describing
the value of their theory (Watts & Zimmerman,
1986, p. 356), they make the following
claim:


The theory provides investors and financial analysts with
a useful predictive model of the accounting procedures
underlying the financial statements. Using the theory, investors
or analysts do not [emphasis added] interpret balance
sheet and earnings numbers as unbiased estimates
of firm value and changes in firm value. Instead, they recognize
the effect of the contracting and political processes
on the calculation of earnings and balance sheet numbers.
For example, the manager’s incentives to choose
earnings increasing/decreasing accounting methods depend
on the existing compensation and debt contracts.
With knowledge of these contracts, the analysts can adjust
the reported numbers [emphasis added]. In particular,
if Healey’s (1985) evidence regarding the effect of
... 11) is confirmed, an investor or analyst could adjust
earnings number for expected management manipulations
[emphasis added] in deriving cash flow estimates.
This would help the investor or analyst better predict the
market value of nontraded stocks or bonds.


What Watts & Zimmerman are asserting is that
for the purposes of financial analysis, accounting
measures are biased by the manipulative behavior
of management and that this behavior is
predicted by their theory (more specifically by
tests of TP1). Yet this theory has been tested
using these same, unadjusted accounting measures.
What one is required to believe is that,
though management affects accounting measures
that are biased and, therefore, unreliable
for purposes of assessing the value of the firm,
these same accounting measures are unbiased
and reliable indicators of the economic self-interests
of managers. To entertain the idea that
one is conducting scientific experiments requires
that the scientist provide some assurances
that the measurements he is using are reliable
indicators of the magnitude of treatment
factors. His measurements of independent variables
must be independent of his theory, i.e.,
how he interprets his observations of independent
variables should not depend upon the
status of his theory. In Watts & Zimmerman’s
case, another theory is needed that predicts that
no matter how much management manipulates
accounting measures to its economic advantage, 
the accounting measures so produced are al- 
ways unbiased measures of the economic self-in- 
terests motivating managers to manipulate. If 
TP2 is true, with manipulative management, 
skepticism about the “evidence” positive resear- 
chers have produced seems to be in order until 
such a theory is produced. It would seem for 
now that if convincing tests of TP1 are to be 
made under the assumption of successfully man- 
ipulative managers, the measures of the indepen- 
dent variables must exclude measures produced 
by those managers. 

The other interpretation of managerial self- 
seeking behavior under the condition that TP2 is 
true is that it is not manipulative. That is equiva- 
lent to granting to the positive researcher his as- 
sertion that accounting measures used as inde- 
pendent variables do provide reliable measures 
of managers’ economic self-interests. Thus, managers’
preferences for accounting procedures
produce accounting measures that have an
economic interpretation; they tell the researcher
something about what managers’ economic
self-interests are. Therefore, regardless of
their motives, managers are somehow guided to
create and prefer accounting procedures that
yield economic interpretations that not only tell
about their own behavior but also publicize
their economic interests. The implication of
granting this status to accounting measures is
the apparent loss of self-interest as an explanation.
If the accounting procedures preferred by
managers produce accounting measures believed
by positive researchers to be reliable indicators
of economic constructs, then it can be
said that in a very real sense the choices of managers
are not manipulative; they are veracious.
Since managers’ past choices are used as mea- 
sures of independent variables and those mea- 
sures are assumed to be unbiased measures, then 
self-interest and veracity become indistinguisha- 
ble. It is extremely difficult to discern whether 
managers’ accounting preferences are moti- 
vated by economic self-interest or by a desire to 
tell the truth, at least to positive researchers. 

Perhaps another way of stating the above con- 
clusion will add some clarity. Experimentally,
positive theory states that managers’ prefer- 
ences for accounting procedures are function- 
ally determined by accounting measures of man- 
agers’ economic self-interests, i.e.: 

Managers’ current preferences = f (accounting measures).


But the accounting measures are believed to
measure in an unbiased way the economic vari- 
ables that tell the researcher about managers’ 
economic self-interests, so: 


Managers’ current preferences = f (accounting measures 
that are unbiased), 


where unbiased means that the researcher be- 
lieves them to convey knowledge about econ- 
omic phenomena motivating managers. 

The assumption that TP2 is true permits the 
following statement: 


Managers’ current preferences = f (managers’ past preferences
which produced unbiased measures).


An experiment confirming this relationship (that
unbiased accounting measures predict managers’
current preferences, i.e., coefficients of
the variables are statistically significant) would
imply that managers’ behaviors are consistent
with their economic self-interest, but it also implies
that their current preferences will produce
unbiased measures in the future. This last implication
appears to be a necessary assumption for
the following reason.

Since scientists are concerned with answering
questions of the general form “What should one
expect?” (Toulmin, 1986), the experimenter as a
scientist asserts that his result permits him, 
when he performs a similar experiment for the 
next accounting preference, to expect to obtain 
a similar result. His confidence that his next ex- 
periment will produce the expected result is jus- 
tified only if he believes that the preference he 
has just investigated results in unbiased mea- 
sures. He has already implicitly assumed that all 
preferences that preceded his recently com- 
pleted experiment did so. If managers’ current 
preferences do not produce unbiased measures
in the future, measurement error of the independent
variables in that future experiment has
been induced. This would seriously compromise
the confidence of any experimenter in
his ability to replicate his result in the future.
With biased measures, future experiments may
not result in economic variables predicting managers’
preferences.

Only if managers’ choices lead to unbiased
measures is the experimenter’s confidence in his
result as a scientific result justified. But if managers’
choices always lead to unbiased measures,
then it becomes moot whether economic self-interest
or a desire to disclose truthfully what their
self-interests are motivates managers in their accounting
preferences. How does one tell the difference
empirically? The positive researcher
seems to sacrifice his ability to establish empirically
the truth of TP1. If managers’ preferences
always produce unbiased measures of their
economic self-interests, then those measures
cannot distinguish what the motives for those
preferences are. Some accounting measures that
would be different between self-interest and
veracity as motives are not available.


In conclusion, if TP2 is true, then the PAR 
results produced thus far are not meaningfully 
interpretable. If managers manipulate, then ac- 
counting measures cannot just be assumed to be 
unbiased measures of managers’ economic self- 
interests. If the positive researcher wishes to
continue to assert that he is entitled to regard accounting
measures as unbiased measures of
managers’ economic self-interests, then whether
or not TP1 is true does not matter. Accounting
measures produced by managers’ preferences 
are the same regardless of motives; the truth of 
TP1 cannot be determined using accounting 
measures. 


If TP2 is not true 

The second context within which to consider 
the positive researcher’s belief in the reliability 
of accounting measures is that of TP2 being un- 
true, i.e., managers’ preferences for accounting 
procedures do not affect the existing set of ac- 
counting procedures permitted in practice. This 
argument is a more plausible one than that under 
the condition that TP2 is true. 

The assumption that management does not af- 
fect accounting practice implies that some regu- 
lar (or irregular) process or mechanism, poten- 
tially involving a multitude of participants (liv- 
ing and dead), is being assumed to affect ac- 
counting practice instead of management. Vari- 
ous analogies of this process are available, e.g., a
cartographic analogy (Solomons, 1978) or accounting
practice as analogous to a commodity
sold in a competitive market (agency theory is
one such characterization). Whatever the analogy,
always tacitly evoked, the researcher relies
upon it to provide the assurance needed to as-
cribe economic meaning to his independent var- 
iables. This dependence on some assumed pro- 
cess which always produces reliable accounting 
measures has equally troublesome implications 
for the positive researcher. 

Reliance on “something” other than manage- 
ment implies that the process of creating ac- 
counting practice does not require management 
participation. Accounting measures would be 
the same whether management prefers one ac- 
counting procedure or another.¹² This must be
asserted, for if the measures were not the same 
we would be back to TP2 being true and the ar- 
guments given in the previous section would 
hold. Results of the experiments listed in Table 1 
might then be legitimately claimed to be tests of
TP1; managers’ economic self-interests do deter- 
mine their preferences for accounting practices. 
But that assurance comes only by destroying 
PAT as an explanatory theory of accounting 
practice. If positive researchers want to argue 
that their measures of managers’ economic self- 
interests are reliable because TP2 is not really 
true, then the theory that informs them no 
longer has an explanatory proposition. Managers 
are no longer causal. PAR is now the same kind
of research accountants have been doing for 
many years, i.e., Lens-model research. Account- 
ing variables, independently produced, are used 
to predict choice; accounting practice is being 
used to explain managers’ choices which is the 
opposite of what PAT purports to do. Positive re- 
searchers deprive PAT of its self-proclaimed 
status as an explanatory theory of accounting
practice. 


CONCLUSION 


The conclusion of this paper is that an
explanatory theory of accounting practice can- 
not be convincingly tested by resorting to ac- 
counting practice as its own explanation. It was 
demonstrated that the practice of using existing 
accounting measures as proxies for manage- 
ments’ economic self-interests makes PAT 
tautologous at the experimental level. This prac- 
tice makes it logically impossible for the two fun- 
damental theoretical propositions of PAT to be 
true simultaneously. If the results of PAR (tests
of the self-interest hypothesis) are to be ac- 
cepted with any degree of confidence, then man- 
agers cannot be assumed to be causal. If mana- 
gers are assumed to be causal, then either the 
results of PAR are uninterpretable or the self-in- 
terest proposition becomes undecidable. It is 
not logically possible, using accounting mea- 
sures, to establish simultaneously the truth value 
of the two propositions. 

¹² Here “same” does not necessarily mean identical. “Same” implies immaterial difference, where immateriality is decided by
the positive researcher.

What would seem to be minimally necessary
for positive researchers to actually test the prop-
ositions of PAT would be the design of experi- 
ments in which the truth of one proposition is 
not crucial to the test of the other and “facts” do 
not have to be assumed. For example, the way 
“political costs” have been tested thus far is to in- 
clude an accounting measure of size as an argu- 
ment in a linear decision model. What “facts” are 
implicitly assumed? One is that a linear decision 
model is appropriate; two is that larger firms ac- 
tually bear higher political costs (an assertion, 
not an empirically tested proposition); and three 
is that accounting measures are unbiased esti- 
mates of “size”. If results are not significant, is it 
because managers are not self-interested, or that 
linear models do not capture very well the deci- 
sion process, or that size is a poor indicator of ex- 
posure to political costs, or that accounting mea- 
sures are poor measures of “size”? There is no 
way to tell. 

Independent tests of TP1 and TP2 are possi- 
ble, but may very well require accounting re- 
searchers to largely abandon the equating of 
“scientific” behavior with sophisticated financial 
statement analysis that is the characteristic of so 
much current empirical work in accounting. 
PAT’s substantive contribution has been its em- 
phasis on accountability, not predictive useful- 
ness, as the organizing principle of accounting. 
It, perhaps properly, made financial statements 
the dependent variables, but has confused things 
empirically by resorting to the comfortable 
financial analysis simile and making the same 
statements its independent variables. Given the
current institutional make-up of the process by
which accounting procedures are created, the 
PAT perspective would seem to lead some re- 
searchers to consideration of political power 
and the question of managements’ abilities to af- 
fect the outcomes of this process. This requires 
experimental designs necessitating that data be 
gathered from sources other than those shared 
with financial analysts. 

For example, one experiment to test TP2 
could take advantage of the documented history 
of the 97 FASB standards actually adopted. Some 
of these standards underwent major changes 
from initial exposure to final form. In addition to 
adopted standards, some proposed standards
were “killed”. With a sample of issues that large, 
it should be possible to assess the relative likeli- 
hoods of positions advocated by management 
(weighted, perhaps) being adopted versus not 
being adopted. Such a study might provide some 
evidence about the extent of management’s 
power in creating the story told about it. 
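
The comparison proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Position` record, its `weight` field, and the toy values are hypothetical and are not drawn from the FASB record; the paper leaves the weighting scheme open ("weighted, perhaps").

```python
from dataclasses import dataclass


@dataclass
class Position:
    """One position advocated by management on a proposed standard (hypothetical record)."""
    adopted: bool        # did the advocated position survive into the final outcome?
    weight: float = 1.0  # assumed importance weight; the paper leaves weighting open


def adoption_odds(positions):
    """Weighted odds that a management-advocated position is adopted versus not adopted."""
    adopted = sum(p.weight for p in positions if p.adopted)
    rejected = sum(p.weight for p in positions if not p.adopted)
    if rejected == 0:
        return None  # odds are undefined when nothing was rejected
    return adopted / rejected


# Toy values for illustration only; they are not data.
toy = [Position(True, 2.0), Position(True, 1.0), Position(False, 1.0)]
print(adoption_odds(toy))
```

Odds well above one would be consistent with TP2; odds near or below one would cast doubt on management's power over the process. Note that the inputs come from the standard-setting record, not from the financial statements.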

Another conclusion of this paper is the ques- 
tion it raises of how PAR could be conducted for 
nearly a decade without anyone noting that de- 
fining managerial self-interests in terms of the 
phenomenon such self-interest is to explain 
does serious damage to the believability of any 
results. Notable about the studies listed in Table
1, which span a decade, is that they employ the
same “method”, and their results, when consi- 
dered in toto, are inconclusive. No consistent 
results emerge. No variable persists in signifi- 
cance across studies and each situation investi- 
gated produces significant variables (if any) that 
are unique. Results thus far indicate a great
amount of situational specificity to PAT with, as 
yet, no real evidence to know whether it is poor 
theory or poor tests of the theory. 

This decade-long use of the same method, 
which produces no consistent results, to investi- 
gate a theory that in spite of those results has suc- 
cessfully resisted any amendment suggests two
things. The first is that relying so heavily on 
COMPUSTAT tapes may not be the best way to 
subject PAT to rigorous testing. Perhaps case 
studies of procedure choice or lobbying be- 
havior are in order to begin building a meaning- 
ful data base with which to test such theories of 
accounting practice.

The second thing that PAR’s decade-long con- 
stancy suggests is that a potential problem may 
exist with quality control. Whitley (1984, p. 66)
notes the tendency in academic disciplines for 
competence to become defined in terms of 
people’s abilities to employ various techniques. 
Zeff (1983, p. 134) alludes to this same occurr- 
ence in accounting: “When modeling problems, 
researchers seem to be more affected by techni- 
cal developments in the literature than by their 
potential to explain phenomena”. After ten years 
there is some reason to believe that the tech- 
niques thus far used to test PAT are inadequate to the task.


Abstract


This research examined the effects of information order and hypothesis-testing strategies on audit 
judgments. Auditors started with high or low prior beliefs about an internal control system and then 
reviewed both positive and negative evidence. The results suggest that, unless specifically requested to do 
so, auditors do not generally seek confirming evidence. We also found that prior beliefs have an effect on 
the importance of information order: with high prior beliefs, subjects’ judgments were unaffected by 
information order, while low prior beliefs were associated with a recency effect. 


Auditors’ tasks require the gathering of evidence
and testing of hypotheses in a variety of decision 
contexts. As demonstrated by the extensive 
documentation found in audit working papers, 
evidence gathering and interpretation are sig- 
nificant components of the audit function. In this 
research, we examine (1) whether auditors use 
a hypothesis-testing strategy in which they 
search for or emphasize information which con- 
firms their hypotheses (prior beliefs), (2)
whether the order of information discovery af- 
fects auditors’ decisions, and (3) the conditions 
under which hypothesis-testing strategies might 
dominate the effect of information ordering on 
audit decisions. 


Auditors are concerned with how evidence is
interpreted and used in decision making since
these activities are an integral part of the audit.
Whether or not auditors generally seek confirming
evidence, as opposed to disconfirming evidence,
and whether or not audit decisions are
subject to a recency effect are both issues which
could have a very practical impact on audit decisions.
A study of these factors may result in information
that would be useful in designing audit
procedures and working papers.

Together, the idea of a confirmation effect and 
the idea of a recency effect in information pro- 
cessing suggest competing hypotheses about 
how a decision maker will interpret evidence. 
Specifically, will a confirmation-seeking bias 
dominate a possible recency effect? We investi- 
gated this question by randomly assigning au- 
ditors to one of three hypothesis-testing strategy 
conditions: confirming, disconfirming, or neut- 
ral; and by varying the order of presentation of 
information about the internal controls of a com-
pany being audited. We had two different back- 
ground scenarios which provided two condi- 
tions for prior beliefs: high and low. These three
independent variables (priors, hypothesis-testing
strategy, and order) were completely crossed
in a 2 × 3 × 2 factorial design.
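
The crossed design can be enumerated directly. The sketch below is illustrative only: the two order labels and the balanced random assignment routine are assumptions, since the text above states only that auditors were assigned at random and that the three factors were completely crossed.

```python
import random
from collections import Counter
from itertools import product

PRIORS = ("high", "low")
STRATEGIES = ("confirming", "disconfirming", "neutral")
# Assumed labels for the two presentation orders of the evidence.
ORDERS = ("positive_then_negative", "negative_then_positive")

# Every combination of the three factors: 2 x 3 x 2 = 12 cells.
CELLS = list(product(PRIORS, STRATEGIES, ORDERS))


def assign_subjects(subject_ids, seed=0):
    """Shuffle subjects, then deal them round-robin into the cells (assumed scheme)."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: CELLS[i % len(CELLS)] for i, sid in enumerate(shuffled)}


assignment = assign_subjects(range(24))
print(len(CELLS), Counter(assignment.values()).most_common(1))
```

Dealing shuffled subjects round-robin keeps cell sizes equal whenever the number of subjects is a multiple of twelve.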

The next section of the paper contains a 
review of the relevant literature and the hypoth-
eses to be tested in this research. Section 3 pre- 
sents the design and procedure used to test our 
hypotheses, and Section 4 presents our results. 
In conclusion, Section 5 contains a discussion of 
these results and their implications.


THEORY AND HYPOTHESIS DEVELOPMENT 


Order effects in updating beliefs 

A great deal of research has addressed the 
problem of how people update their beliefs; that 
is, how new information is integrated with prior 
beliefs.¹ While the accepted normative model is
the Bayesian model, psychologists have attemp- 
ted to provide a descriptive model of this pro- 
cess. Recently, Einhorn & Hogarth (1985) 
suggested a sequential anchoring and adjust- 
ment process which they call a contrast/surprise 
model.² In this model, one’s current position
provides an anchor, and adjustments are made as 
new information is received. Each adjustment 
results in a new position, i.e., a new anchor, and
the process continues in this sequential fashion. 
For situations in which a decision maker con- 
fronts both evidence which supports and evi- 
dence which does not support his initial belief, 
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the contrast/surprise model provides a mixed 
evidence model.’ The mixed model predicts a 
recency effect because the adjustment weight 
for any piece of evidence depends on the cur- 


- rent belief or anchor, which is a function of the 


previous evidence. The effect of positive evi- 


` dence is greater when the anchor is small (Le. 


the initial belief is weak), producing a surprise 
effect. The higher one’s initial belief, the smaller 
the effect of positive evidence. On the other’ 
hand, as initial position decreases, the impact of 
negative evidence decreases. Because the 


weight given to new evidence depends on the 


most recent adjustment (Le. a new anchor posi- 
tion is based on the most recent information), 
the model predicts a recency effect. 
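
The recency prediction can be illustrated numerically. What follows is a minimal sketch, assuming the update rule as we read it from the model's definition: negative evidence lowers the belief in proportion to the current anchor, and positive evidence raises it in proportion to the distance between the anchor and 1. The function name `update_belief`, the prior of 0.5, the evidence strengths of 0.3 and the parameter values alpha = beta = 1 are illustrative assumptions, not values taken from the study.

```python
def update_belief(prior, evidence, alpha=1.0, beta=1.0):
    """Sequentially revise a belief in [0, 1] under the assumed rule.

    evidence is a sequence of (sign, strength) pairs, where sign is
    "+" or "-" and strength lies in [0, 1]. alpha and beta stand for
    the attitude parameters toward negative and positive evidence.
    """
    belief = prior
    for sign, strength in evidence:
        if sign == "-":
            # Negative evidence: revision is proportional to the anchor.
            belief = belief - belief * strength ** alpha
        elif sign == "+":
            # Positive evidence: revision is proportional to (1 - anchor).
            belief = belief + (1.0 - belief) * strength ** beta
        else:
            raise ValueError(f"unknown evidence sign: {sign!r}")
    return belief


# Hypothetical inputs: two positive and two negative items of equal strength.
positive_then_negative = [("+", 0.3), ("+", 0.3), ("-", 0.3), ("-", 0.3)]
negative_then_positive = [("-", 0.3), ("-", 0.3), ("+", 0.3), ("+", 0.3)]

final_pos_first = update_belief(0.5, positive_then_negative)
final_neg_first = update_belief(0.5, negative_then_positive)

print(f"positive then negative: {final_pos_first:.3f}")
print(f"negative then positive: {final_neg_first:.3f}")
```

With these assumed inputs the same four items yield a final belief of about 0.37 when the negative items come last and about 0.63 when the positive items come last, which is the recency pattern the model predicts.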

Several researchers have tested the contrast/ 
surprise model in accounting contexts. In a 
series of experiments, Ashton & Ashton (1987) 
gave auditors an initial likelihood that an internal 
control system would prevent or detect material 
errors and then gave them additional evidence 
to use in updating their beliefs. Their results 
were consistent with the contrast/surprise 
model: auditors’ judgments demonstrated a re- 
cency effect. 

Messier et al. (1987) used two different audit
scenarios to test several predictions of the
contrast/surprise model. In their experiments using
mixed evidence, the results were also consistent with
the model’s predictions. There were significant order
effects: the order of evidence of two positive
information items followed by two negative items
resulted in greater downward revisions of initial
likelihood judgments than the order of two negative
items followed by the positive evidence.


¹This process has been addressed in many areas including probabilistic inference, decision theory, attitude change and causal
inference. For references, see Einhorn & Hogarth (1985), p. 3.

²The original model was introduced as the contrast/surprise model (Einhorn & Hogarth, 1985). In a subsequent revision, it
was renamed the contrast/inertia model (Einhorn & Hogarth, 1987). The predictions in this research are consistent with
either formulation of the model.

³That model is defined as follows:

$S_k = S_{k-1} - S_{k-1}\, a_k^{\alpha}$ (for negative evidence)

$S_k = S_{k-1} + (1 - S_{k-1})\, b_k^{\beta}$ (for positive evidence)

where

$S_k$ = strength of belief after evaluating $k$ pieces of evidence ($0 \le S_k \le 1$)

$S_{k-1}$ = strength of belief after evaluating $k-1$ pieces of evidence, i.e. the anchor before evaluating the $k$th piece of evidence ($0 \le S_{k-1} \le 1$)

$a_k$ = strength of the $k$th piece of negative evidence

$b_k$ = strength of the $k$th piece of positive evidence

$\alpha$ = an individual’s attitude toward negative evidence ($\alpha \ge 0$). (The larger the $\alpha$ parameter, the less the revision
due to negative evidence.)

$\beta$ = an individual’s attitude toward positive evidence ($\beta \ge 0$). (The larger the $\beta$ parameter, the less the revision
due to positive evidence.)


Confirmation bias in updating beliefs 

Psychology research has suggested that 
human reasoning is “prone to a ‘confirmation 
bias’ that hinders effective learning” (Klayman & 
Ha, 1987). There is evidence that in many con- 
texts people tend to search for or give more 
weight to evidence which supports an initial 
hypothesis. The use of this sort of hypothesis- 
testing strategy has been found by both psychol- 
ogy and accounting researchers. For example, 
Lord et al. (1979) found that subjects used different
standards for criticizing opposing evidence
than those used for criticizing supporting evi- 
dence. Subjects tended to discredit information 
counter to the hypothesis they supported. Other 
studies have uncovered confirmation tenden- 
cies in the area of social inference (Snyder & 
Swann, 1978), rule discovery tasks (Wason, 
1960) and judgments of contingency (Alloy & 
Tabachnik, 1984). 

While most of this work has been in the psychology and
social psychology fields using students as subjects,
some accounting researchers have also addressed the
topic. For example, Waller & Felix (1984) discussed
several cognitive aspects of the auditor’s judgment
process, one important characteristic being that “the
auditor manifests a strong tendency to seek and use
confirmatory rather than disconfirmatory evidence”
(p. 399). In an empirical investigation, Kida (1984)
examined whether hypothesis-testing strategies employed
by practicing auditors affected their use of judgment
data. In an experimental task dealing with a
going-concern decision, Kida framed the problem as one
about a firm’s probable failure for one treatment and
as one about the firm’s viability for the other
treatment. Both groups were then given the same
information about the company. While there was no
difference between the two groups’ probability
judgments, there was a significant difference in the
number of viable items listed by the subjects as
relevant. The subjects in the viable condition listed
more viable information items than did the subjects in
the failure condition. These results provide limited
support for the use of confirmatory strategies by
auditors, but Kida suggested that the effect might be
more pronounced in situations where information is
received sequentially by auditors.

Kida (1984) also suggested that prior beliefs may lead
to more reliance on confirmatory strategies than the
hypothesis framing he used in his study. Consideration
of both the confirmation-bias research and the order
effects predicted by the contrast/surprise model
provides a pair of competing hypotheses that we test in
this research. First, the contrast/surprise model
predicts a recency effect.


Ha: Judgments of subjects who receive negative evidence
after positive evidence will be lower (more negative)
than the judgments of subjects who receive the negative
evidence before the positive evidence.


On the other hand, if an auditor’s hypothesis-testing
strategy is confirmation-seeking, the information which
supports his/her initial belief or hypothesis should
have more impact on judgments than information which is
inconsistent with his/her beliefs.


Restated as a competing hypothesis: 


Hb: Subjects instructed to confirm their high prior
beliefs will give higher judgments than those instructed
to disconfirm their prior beliefs. Subjects instructed
to confirm their low prior beliefs will give lower
judgments than those instructed to disconfirm their
prior beliefs.


Prior beliefs as a variable of interest 

Our hypotheses can be applied to scenarios of varying
prior beliefs. In other words, we are testing the
competing predictions of the contrast/surprise model
and hypothesis-testing strategy literature without
specific predictions about the effects of the level of
prior beliefs.

However, prior beliefs are important for several
reasons. First, the contrast/surprise model, as well as
other models and theories that apply to updating
beliefs, relies heavily on prior beliefs as an input in
the model. Initial beliefs provide an anchor from which
decision makers adjust these beliefs when presented
with additional evidence. Second, the weighting of
prior beliefs has been a subject of a number of studies
including base rate studies (see, for example, Joyce &
Biddle, 1981) and anchoring and adjustment studies
(see, for example, Butler, 1986). Although the level of
the initial beliefs will necessarily have an effect on
the final judgment since it provides the initial
anchor, these studies do not suggest different rules of
evidence integration for various levels of initial
beliefs. For example, the predictions of our hypotheses
remain the same across varying levels of prior beliefs.
If a subject starts with either low or high priors, our
hypotheses predict that either the most recent new
information will get the most weight (contrast/surprise
model) or the information that confirms the priors will
get the most weight in the belief revision process.

In our study, the definition of confirming and
disconfirming hypothesis-testing strategies does depend
on prior beliefs. It is the initial belief that defines
what will be confirming evidence and what will be
disconfirming evidence. We selected two scenarios that
should induce what we consider high and low initial
beliefs about an internal control system. In the high
priors condition, additional positive information about
the system will then be confirming. In the low priors
condition, additional negative information will be
confirming.

While we did not set forth specific hypotheses related
to high and low priors, we manipulated the priors in
order to give us the potential to generalize our
results. For example, if we had negative information
that was always disconfirming, it would not be possible
to differentiate between the explanation that auditors
tend to be disconfirming and the explanation that they
simply weigh negative evidence more heavily.


METHOD 


Procedure 

Each subject received a booklet containing general
instructions, several experimental tasks, and a
postexperimental questionnaire. First, all of the
subjects read a page of background information about a
manufacturing firm. This information provided very
general facts about the firm’s internal controls and
past audit record. The information was given to enable
subjects to form a hypothesis about the internal
control system, rather than giving them a specific
prior likelihood value.⁴ Following this were ten pieces
of information about specific internal controls, five
positive and five negative. In a 2 × 3 × 2 (Priors ×
Hypothesis-testing Strategy × Order) design, we first
varied the background material to create two levels of
prior beliefs. One treatment consisted of a positive
description of the company, indicating that the company
had received a positive evaluation of internal controls
in the past. The other treatment consisted of a
negative description, citing past trouble with internal
controls. All introductory comments were general; no
specific controls were mentioned. With respect to the
second independent variable, hypothesis-testing
strategy, we varied the instructions given immediately
after the introduction, but before the individual
pieces of data, in order to create three levels of
hypothesis-testing strategy: confirming, disconfirming,
and neutral. Subjects assigned to the confirming
condition were given the following instructions:

In order to support your hypothesis about the strength of
the internal control system, you will need supporting
evidence. In your effort to confirm your beliefs, you have
gathered the following information:


Subjects assigned to the disconfirming condition 
were given these instructions: 


⁴Other researchers have simply assigned a prior likelihood to subjects. In a more realistic fashion, we have allowed auditors
to form their own priors based on the scenario we presented. There is a stream of psychology and organizational behavior
literature that suggests that people feel differently about decisions/judgments they form themselves than those formed by
others. See, for example, Staw (1981).


INFORMATION ORDER AND HYPOTHESIS-TESTING STRATEGIES 


In accordance with GAAS, you have gathered evi- 
dence about the strength of the internal control sys- 
tem. With the appropriate degree of skepticism that is 
part of exercising due professional care, you are al- 
ways looking for evidence to disconfirm your beliefs. 
You have gathered the following information: 


The third treatment was neutral; the instructions 
simply said that the information which was to 
follow had been gathered about the internal con- 
trol system. The information was given in one of 
two orders: five positive items (supporting the 
strength of the internal controls) followed by 
five negative items (pointing out weaknesses in 
the system) or the five negative items followed 
by the five positive items. 
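
The fully crossed design can be made concrete by listing its cells. The sketch below is only an illustration of the 2 × 3 × 2 structure described above; the condition labels are shorthand chosen here, not wording from the experimental materials.

```python
from itertools import product

# Shorthand labels for the three manipulated factors (illustrative names).
priors = ["high", "low"]
strategies = ["confirming", "disconfirming", "neutral"]
orders = ["positive items first", "negative items first"]

# Fully crossing the factors gives every combination of levels exactly once.
cells = list(product(priors, strategies, orders))

for prior, strategy, order in cells:
    print(f"{prior:>4} priors | {strategy:<13} | {order}")

print(len(cells), "cells")
```

Each subject falls into exactly one of the twelve cells.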

Each subject gave a likelihood judgment, estimating the
probability that the internal controls would prevent or
detect material error,⁵ after reading the background
material. Then, after each set of five items, subjects
were asked for another likelihood judgment that the
controls would prevent or detect a material error, so
that each subject gave three likelihood judgments.

Following this were several unrelated experi- 
mental tasks. Then, all subjects were given a list 
of the ten information items that had been pre- 
sented earlier. Subjects were asked to rate the re- 
levance of each item with respect to their judg- 
ments of the likelihood that the internal control 
system would prevent or detect material error. 
They were asked to assign 100 to the most relevant
piece of information and a number between 1 and 99 to
each of the others, depending on how each compared in
relevance to the one assigned 100.


Subjects

One hundred and twenty-three practicing au- 
ditors from five different international account- 
ing firms participated in the experiment. Four 
firms provided subjects at staff training sessions; 
one returned the experimental materials to us 
by mail. The auditors had between three and 
nine years of experience. 


RESULTS 


The results of our experiment are summarized 
in Table 1. The predictions of each theory or 
model are shown, along with our hypotheses 
and the results related to each. 

The first likelihood judgment was given after the
subject read the background material. This was used as
a manipulation check to be sure that the two levels of
background material, designed to induce high and low
priors, were perceived as different. The mean
likelihood judgment for the group given the positive
scenario was 73.8, while the mean likelihood judgment
for the group given the negative scenario was 42.5.
These were significantly different (t = 8.2; p < 0.001).

Because of the significant difference in initial 
judgments due to the different background data 
presented, the remainder of the analysis is pre- 
sented in two parts. First, the data and results 
from the subjects who began with the high 
priors are presented, followed by the same 
analysis for the subjects who began with the low 
priors. 


High prior beliefs 

There were 62 subjects who received the high priors
background information. Three subjects were eliminated
because their responses were inconsistent with the
task.⁶ The remaining subjects’ responses were analyzed
in a 3 × 2 (Hypothesis-testing Strategy × Order) ANOVA.
The final likelihood judgment was used as the dependent
variable because that is the judgment which most
closely parallels the crucial judgment in a real
decision-making setting. Generally, auditors do not
give a formal preliminary evaluation of an internal
control system before examining the specific controls.
We asked for an initial judgment only as a manipulation
check. Other researchers have simply assigned an
initial belief to subjects (e.g. Ashton & Ashton,
1987). Having subjects come to their own conclusions
about initial beliefs gives more assurance that we have
actually captured their priors. This judgment would
then include any preconceived ideas about internal
control strength that the subjects had when they began
the experiment. Simply assigning an initial belief
offers less assurance that the subjects actually
possess that belief.


⁵We framed this question the way we believe the majority of major firms evaluate controls (Cushing & Loebbecke, 1986, p.
22). Our framing is also consistent with that of other researchers (Ashton & Ashton, 1987), increasing the comparability of
our results with results from other experiments. However, it would be of interest to frame the evaluation question in a more
positive way to investigate the potential effect on auditors’ evaluations.

⁶Both the first and the second judgments were used as manipulation checks to see how subjects responded to the original
scenarios and to subsequent positive and negative information. Using the first judgment to check our priors manipulation,
we eliminated one subject because of an initial likelihood judgment of 0 in the high priors condition. Using the second
judgment as a manipulation check, we found two other subjects who did not react appropriately to positive and negative data
(i.e. likelihood judgment decreased when receiving positive and when receiving negative data). The analysis that results
when these subjects are included is consistent with our reported results.


TABLE 1. Summary of experimental hypotheses and results

Model/theory: Contrast/surprise model
Hypothesis:   Receiving both positive and negative information will result
              in a recency effect with respect to the final likelihood judgment.
Results:      High priors: No recency effect. No support for model.
              Low priors: Marginally significant recency effect. Supports model.

Model/theory: Hypothesis-testing strategy
Hypothesis:   Evidence which confirms prior beliefs will have more impact on
              final likelihood judgment than evidence which does not.
Results:      High priors: Only subjects who received specific instructions to
              confirm their priors weighted the positive information more
              heavily than the negative; disconfirming and neutral groups
              weighted all of the information equally. No support for a
              confirming strategy.
              Low priors: No support for a confirming strategy by any subjects.

While other researchers have used the difference
between the initial judgment and the final judgment as
the dependent variable,⁷ we are more interested in
whether or not the final judgment reflects a recency
effect or a confirmation bias. It is this final
judgment about the strength of internal controls on
which auditors rely in making decisions about the
extent of substantive testing required in the audit.

The ANOVA results, shown in Table 2, indicated a
significant effect due to hypothesis-testing strategy
at p = 0.0048. There was neither a significant effect
due to the order of the information, nor a significant
interaction. As shown in Table 3, the mean judgment
(final likelihood rating that the internal control
system would prevent or detect material error) for
subjects in the confirming treatment, who received
instructions to confirm their original hypothesis about
the internal control system, was 65.2, compared to 46.4
for the disconfirming subjects and 40.5 for the neutral
subjects. The Tukey method (Winer, 1971) was used to
compare the means. The mean of the confirming treatment
was significantly different from the means of the other
two treatments at conventional levels (p = 0.05). The
judgments of the subjects in the disconfirming
treatment and those in the neutral treatment (no
instructions) were not significantly different at
conventional levels.


⁷Note that using the difference between initial judgments and final judgments is not different from using final judgments
when subjects are all given the same initial judgment.


TABLE 2. Analysis of variance: hypothesis-testing strategy (HTS) × order for high priors subjects

Source of      Sum of             Mean                   Significance
variation      squares       df   square       F-ratio   level
HTS             6,453.2985    2   3,226.6493   5.922     0.0048
Order             500.3113    1     500.3113   0.918     0.3525
HTS × order     1,895.2146    2     947.6073   1.739     0.1856
Residual       28,879.188    53     544.8903
Total          38,003.390    58


TABLE 3. Mean final likelihood judgments of high priors subjects

                   Hypothesis-testing strategy
Order          Confirming   Disconfirming   Neutral
+++++-----     67.5         47.2            30.4      (mean 48.4)
-----+++++     62.8         45.5            50.5      (mean 52.9)
(means         65.2         46.4            40.5)
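
The F-ratios reported in Table 2 follow from the tabulated sums of squares and degrees of freedom: each effect's mean square is divided by the residual mean square. The short check below is a sketch of that arithmetic only; the helper name `f_ratio` is our own, and the significance levels are not recomputed here.

```python
# Sums of squares and degrees of freedom as reported in Table 2.
effects = {
    "HTS": (6453.2985, 2),
    "Order": (500.3113, 1),
    "HTS x order": (1895.2146, 2),
}
residual_ss, residual_df = 28879.188, 53


def f_ratio(effect_ss, effect_df, error_ss, error_df):
    """Return the F-ratio: effect mean square over error mean square."""
    return (effect_ss / effect_df) / (error_ss / error_df)


f_ratios = {
    name: f_ratio(ss, df, residual_ss, residual_df)
    for name, (ss, df) in effects.items()
}

for name, value in f_ratios.items():
    print(f"{name:<12} F = {value:.3f}")
```

Rounded to three decimals, the computed ratios agree with the tabulated values of 5.922, 0.918 and 1.739.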


Low prior beliefs 
The subjects who started with the low priors produced
very different results. There were 61 subjects in this
treatment, 14 of whom were eliminated because of
inappropriate responses.⁸ The remaining subjects’
responses were analyzed in a 3 × 2 (Hypothesis-testing
Strategy × Order) ANOVA, exactly like the analysis done
with the high priors subjects. As shown in Table 4,
there was a marginally significant main effect due to
order at p = 0.07, while there was no significant
effect due to hypothesis-testing strategy at
conventional levels. Again, the interaction was not
significant. Examination of the means, given in Table
5, shows a clear recency effect for the confirming and
neutral treatments and a smaller difference between the
two treatment orders with respect to the disconfirming
group. This recency effect is consistent with the order
effect predicted by the contrast/surprise model.


⁸Fourteen subjects were eliminated, 12 for an initial likelihood judgment above 50 and two for revising their judgments in
the same direction for both positive and negative data. Removing these subjects had no substantial effect on our results.


TABLE 4. Analysis of variance: hypothesis-testing strategy (HTS) × order for low priors subjects

Source of      Sum of             Mean                   Significance
variation      squares       df   square       F-ratio   level
HTS                32.4902    2      16.2451   0.044     0.9575
Order           1,265.0416    1   1,265.0416   3.387     0.0729
HTS × order     1,248.9119    2     624.4560   1.672     0.2004
Residual       15,311.2080   41     373.4441
Total          17,832.213    46


TABLE 5. Mean final likelihood judgments of low priors subjects

                   Hypothesis-testing strategy
Order          Confirming   Disconfirming   Neutral
+++++-----     19.2         34.8            31.0      (mean 28.3)
-----+++++     45.0         35.0            38.1      (mean 39.3)
(means         32.1         34.9            34.6)


Relevance ratings

All subjects rated the ten information items used in
the experiment with respect to their relevance in
making their likelihood judgments. In a 2 × 3 × 2
(Priors × Hypothesis-testing Strategy × Order) ANOVA
using each subject’s mean relevance rating for the five
positive items, there were no significant differences
due to priors, hypothesis-testing strategy or order at
conventional levels. Similarly, a 2 × 3 × 2 ANOVA with
each subject’s mean relevance rating for the negative
items showed no significant differences due to any of
the independent variables. The relevance ratings
support neither a confirmation bias nor an ordering
effect by auditors.


DISCUSSION AND FUTURE RESEARCH ISSUES


Our experimental results provide only partial support
for the contrast/surprise model. When subjects started
with low prior beliefs about the likelihood that an
internal control system would detect or prevent
material error, their subsequent updating of beliefs
was a function of their previous beliefs. This caused a
recency effect as predicted by the first hypothesis,
based on the contrast/surprise model. The final mean
judgment was lower for the subjects who received the
negative data last, compared to the judgments of
subjects who received the positive data last. This
effect was most pronounced for the confirming and
neutral hypothesis-testing strategy conditions.

However, when subjects started with high 
prior beliefs, their final judgments did not show 
a significant recency effect. The hypothesis-test- 
ing strategy condition did make a difference. 
Subjects who received instructions to confirm 
their original hypothesis gave final likelihood 
judgments which were significantly higher than 
those of the other two hypothesis-testing 
strategy conditions. From the point of view of a 
change from prior beliefs to final judgment, on 
average the confirming treatment subjects did 
not change their beliefs after receiving both 
positive and negative information. The information
seemed to be weighted equally, thus cancelling out any
net effect. The subjects in the disconfirming and
neutral treatments did lower
their beliefs, giving more weight to the negative 
information and less to the positive than the sub- 
jects in the confirming treatment. 

As an auditor goes about gathering evidence and making
implicit and explicit judgments similar to those in
this experiment, specific instructions are not given
each time. Since the neutral (no instructions)
treatment may most closely resemble actual practice, we
were interested to see whether the judgments of
subjects in that condition were more similar to those
of subjects in the confirming treatment or to those in
the disconfirming treatment. In the high priors
condition, the judgments of the neutral subjects were
not significantly different from the judgments of the
subjects in the disconfirming treatment but were
significantly different from the judgments of subjects
in the confirming treatment. This result lends support
to the notion that auditors are not, in general,
subject to a confirmation bias. However, if
instructions were given to engage in a more
confirming-oriented strategy, it appears that auditors
could respond by adopting such a strategy. In the low
priors condition, the recency effect was greatest for
the confirming treatment, as shown in Table 5. Again,
the results for the neutral treatment were more similar
to those of the disconfirming treatment than to those
of the confirming treatment. Overall, this is evidence
that auditors are more naturally disconfirming. That
is, the absence of instructions with respect to
hypothesis-testing strategy results in judgments more
consistent with a disconfirming strategy than with a
confirming strategy.

That there were no differences between the groups due
to any of the independent variables with respect to the
relevance ratings of the positive and negative data
further supports the notion that auditors are not
subject to a confirmation bias. This measure is similar
to one used by Kida (1984), who found some weak
evidence for a confirmation bias in a going-concern
task. The mean relevance rating for the five negative
items was higher for all groups than the mean relevance
rating for the five positive items. This is consistent
with the nature of auditing. While positive evidence is
useful to accumulate, it only takes one weakness in an
internal control system to allow errors to take place
or to go undetected. Thus, the relative importance of a
single piece of negative evidence appears to be greater
than that of a single positive piece of evidence. Kida
(1984) found similar results in that subjects in both
failure and viability conditions with respect to a
going-concern decision attended to more failure items.

While our results do not present a consistent or easily
described picture of the audit judgment process, the
specific conditions under which there is an information
ordering effect have been described in more detail than
in past research. Further, under a variety of
conditions, we found no evidence that auditors are
prone to a confirmation bias. When these results are
considered along with prior research on the issue of
confirmation bias (e.g. Kida, 1984) and the
contrast/surprise model (e.g. Ashton & Ashton, 1987),
it seems clear that information processing researchers
in accounting must continually explore the limits of
the theories we test.


COGNITIVE SCRIPTS IN AUDITING AND ACCOUNTING BEHAVIOR*


FREDDIE CHOO
School of Accounting, University of New South Wales, Sydney


Abstract 


The purpose of this paper is to present a major concept from social cognition called “script” and to develop
the script notion for application to auditing and accounting behavior. Three developments are of primary
importance in this paper: (1) a concern with adapting scripts to the study of auditing and accounting
behavior. The script concept is explored in depth and is shown to have substantial promise; (2) a
depiction of auditing and accounting as a set of scripted or scriptable situations, which offers a potential
alternative perspective on auditing and accounting behavior. Examples of scriptal situations in auditing and
accounting are demonstrated; and (3) a provision of some research propositions as useful preludes for
stimulating further empirical or theoretical work.


“The use of scripts as a way of organizing data in
memory may be among the most interesting areas for
future (accounting) research” (Birnberg & Shields,
1984, p. 378). The purpose of this paper is to present
a major concept from social cognition called “script”
and to develop the script notion for application to
auditing and accounting behavior. Theoretically based
and empirically tested concepts in cognitive and social
psychology are drawn upon to understand the underlying
cognitive dynamics of behavior in auditing and
accounting. This objective is accomplished in three
main steps: (1) by exploring the script concept in some
depth; (2) by demonstrating scriptal situations in
auditing and accounting; and (3) by providing some
propositions for stimulating further empirical or
theoretical work.


THE SCRIPT CONCEPT 


+ 


Scripted behavior 
Recent work from cognitive psychology 


- suggests that schema-based information plays a 


significant role in the enactment of much human 
behavior (Taylor & Crocker, 1981). A schema! 
is “an abstract representation of knowledge 
structures that people use to organize and make 
sense of social and organizational information or 
situations” (Fiske & Kinder, 1981, p.173). 

In functional terms, schemas direct attention 
to relevant information, guide its interpretation 
and evaluation, allow for inferences when infor- 
mation is missing or ambiguous, and facilitate its 
retention. Several schematic constructs that 
shared these common functions had been de- | 
veloped in the past. One of these is “script” 


*I wish to acknowledge the helpful comments and criticisms of Barry Lewis and an anonymous reviewer on earlier versions 
of this paper. The support of a Special Research Grant from the University of New South Wales is appreciated. 


'More precisely, Rumelhart & Ortony (1977, p. 101) define schemas as cognitive representations of generic concepts 
consisting of attributes that constitute the concepts, and relationships among the attributes. This is consistent with Bartlett’s 


(1932, p. 20) original description of a schema as, *... 


an active organization of past reactions, or of past experiences, which 


must be supposed to be operating in any well-adapted organic response.” A more recent definition by Taylor & Crocker 
(1981, p. 91) is that: “A schema is a cognitive structure that consists in part of the representation of some defined stimulus 
domain. The schema contains general knowledge about that domain, including a specification of the relationships among its 
attributes, 2s well as specific examples or instances of the stimulus domain.” 
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$ auditing/accounting situation (Louis, 
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(Abelson, 1976, 1981). Scriptal knowledge 
structures retain knowledge of expected sequ- 
ences of behaviors, actions, and events and are 
concerned both with understanding the behavior
of self and others and with guiding one’s
behavior in specific situations or contexts² (see
e.g. Abelson, 1976; 1981; Graesser et al., 1979; 
Schank & Abelson, 1977). A script is presumed 
to be constructed of more fundamental units, 
termed “vignettes” (Wyer & Carlston, 1979) and 
“scenes” (Tomkins, 1978). A vignette can be 
viewed as the most elemental unit of a given 
script. It is a basic representation of an event and 
can involve individuals, behaviors, or contexts. 
A series of linked or related vignettes form a 
scene, which is a more comprehensive compo- 
nent of the script attached to a situation. Thus, a 
script is composed of a series of scenes made up 
of linked vignettes. Typical examples include 
“going to a restaurant”, “visiting doctors” (Graesser
et al., 1979), and “attending lectures”
(Nakamura et al., 1985). 


Since scripts are held in one’s memory or 
knowledge structure for understanding events 
and behavior, they provide dual benefits to the 
study of auditing/accounting behavior. One, 
they enable an understanding of a particular
auditing/accounting situation (Louis, 1980;
Weick, 1979), and two, they provide a guide to
appropriate behavior in that situation. Understanding
a situation involves a search in memory
to draw on previous situational experiences 
similar to the present one (Schank & Abelson, 
1977). One’s behavior (and the effectiveness or 
consequences of the behavior) in these previous
situations then serves as input to a script in the 
memory. This script specifies the behavior likely 
to fit a present situation. 


Some situations encountered in auditing/ 
accounting are predictable, conventional, fre- 
quently encountered and rules-driven and thus 
neatly fit into the description of a generalized 
script; for example, the normal way one goes
about testing internal controls in auditing (a 
generalized “internal control evaluations” 
script); performing consolidations in financial 
accounting (a generalized “consolidation” 
script), and setting budgets in management 
accounting (a generalized “budgeting” script). 
Other situations approximate a generalized 
script but differ in details from one instance of 
script performance to another; for example, de- 
termining the timing, extent and scope of com- 
pliance tests within a generalized internal con- 
trol evaluations script, accounting for minority 
interests within a generalized consolidation 
script, and attending an ad hoc decision making 
meeting within a generalized budgeting script. 
Such situations entail some variations of the 
generalized scripts and require some means of 
distinguishing knowledge of these variations in 
memory. 

Variations on a script theme are known as dif- 
ferent “tracks” of a script (Abelson, 1981). Thus, 
for example, a management accountant could 
hold a generalized budgeting script and retain 
specific knowledge about events and behaviors 
appropriate to different types of budgets in diffe- 
rent tracks of this budgeting script. Similarly, an 
auditor could hold a generalized internal control 
evaluations script and retain specific knowledge 
about different criteria for determining the tim- 
ing, extent and scope of compliance tests in dif- 
ferent tracks of the internal control evaluations 
script. This process allows a repertoire of related 
functional tracks to be retained; for example, in 
the case of the generalized budgeting script, one 
for the conduct of an operating budget; another 
for a similar, but distinctive financial budget; and 
yet another for a related capital budget; and in 
the case of the internal control evaluations 
script, one for planning the extent (size) of a 
sample to be tested; another for a closely related 
plan for the timing of gathering the samples, and 
yet another for the scope of testing the samples. 
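
The composition described above (vignettes linked into scenes, scenes forming a generalized script, and tracks holding the variations on that script) can be pictured as a small data structure. The following is only an illustrative sketch and not part of the script literature cited here: the class names, the `events` helper, and the particular scene and vignette labels for the budgeting example are assumptions chosen for exposition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Vignette:
    """Most elemental unit of a script: a basic representation of one event."""
    event: str


@dataclass
class Scene:
    """A series of linked vignettes."""
    name: str
    vignettes: List[Vignette] = field(default_factory=list)


@dataclass
class Script:
    """A generalized script plus named tracks (variations on the script)."""
    name: str
    scenes: List[Scene] = field(default_factory=list)
    tracks: Dict[str, List[Scene]] = field(default_factory=dict)

    def events(self, track: Optional[str] = None) -> List[str]:
        """Events of the generalized scenes, followed by those of one track."""
        scenes = self.scenes + (self.tracks.get(track, []) if track else [])
        return [v.event for s in scenes for v in s.vignettes]


# Hypothetical content: the event labels below are illustrative only.
budgeting = Script(
    name="budgeting",
    scenes=[Scene("preparation", [Vignette("gather prior-period figures"),
                                  Vignette("agree on targets")])],
    tracks={
        "operating": [Scene("operating detail",
                            [Vignette("forecast sales and expenses")])],
        "capital": [Scene("capital detail",
                          [Vignette("rank investment proposals")])],
    },
)

print(budgeting.events("capital"))
```

In this sketch the knowledge common to all budgets is stored once in the generalized scenes, while each track adds only what distinguishes it.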


²It should be noted that other schematic constructs, for example, “template” (Newell, 1979), “prototype” (Cantor & Mischel,
1977) and “frame” (Minsky, 1975) can also have behavioral implications. What makes the script construct unique appears to be
that it specifies behavioral contingencies. In other words, one action or event has implications for the next action or event in
the sequence.


COGNITIVE SCRIPTS IN AUDITING AND ACCOUNTING BEHAVIOR 


Proposition 1. Auditors/accountants cognitively retain 
generalized scriptal knowledge of expected sequences of 
auditing/accounting behaviors, actions, and events; such 
scriptal knowledge provides a guide to appropriate be- 
havior in specific auditing/accounting situations. 


Abelson (1981) distinguishes between two 
categories of scripts, weak versus strong scripts, 
that people hold in memory. Weak scripts bear a 
resemblance to other forms of cognitive struc- 
tures such as person prototypes (e.g. extroverts 
or poor performers), which serve to organize 
expectations about the potential attitudes of
such people. Although weak scripts organize expectations
about the potential behaviors of
others and oneself, they do not specify the exact 
sequence of these behaviors. “For example, a 
typical circus performance presents trapeze ar- 
tists, a lion tamer, jugglers, and so on, but there 
is no necessary order to the various acts. Still, it 
seems appropriate to refer to the ‘circus’ script 
in the weak sense of the script.” (Abelson, 1981, 
p.717).

Weak scripts apply most obviously to problem
solving meetings in management accounting or
perhaps solving a potential conflict of interests
between the auditor and management. Under 
those circumstances, one knows what might 
happen, in general, but cannot predict a specific 
order. This situation occurs not because of a lack 
of prior scriptal knowledge (otherwise there 
would not be any anticipated event) but because 
the script has many complex “tracks” (Abelson,
1981), so that exactly which events will occur and
in what order they will occur cannot be
specified a priori. For example, in auditing, a
confirmation of a suspected irregularity fits 
neatly into this category. The auditor could hold 
a generalized “confirmation of a suspected 
irregularity” script and retain knowledge about
audit behaviors and procedures appropriate to
many different “tracks” of the script. Different
“tracks” of the script entail variations in audit behaviors
and procedures. For example, one possible
“track” would involve, among other things, a
discussion with the audit committee which
might lead to a disclaimer of opinion. Another
“track” would involve a query with the chief accountant
which might result in issuing a qualified
opinion. Since there are many potential
“tracks” to the generalized “confirmation of a
suspected irregularity” script, the auditor can
never be sure which “track” will take place.
Furthermore, within each “track”, it is difficult to
predict the specific sequence of events.
For example, a particular “track” might contain
events such as asking the client about the
materiality of the suspected account, informing
the Board of Directors, seeking outside legal advice
and so on. The exact sequence and the
probable occurrence of those events cannot be
easily anticipated a priori.

Strong scripts, on the other hand, contain ex- 
pectations not only for the exact occurrence of 
events, but also for the progressive sequence of 
the events. In Abelson’s (1981, p.717) words, 
“The distinctive aspect of strong scripts is the re- 
levance of learned associations between prior 
and consequent events.” Strong scripts are re- 
served for stereotypical and ritualistic occa- 
sions. Under those circumstances, one knows 
what will happen, and the order in which it will 
happen. “In the strongest sense of a totally 
ritualized event sequence (e.g. a Japanese tea 
ceremony), script predictions become infallible 
— but this case is relatively rare” (Abelson, 
1981, p.717). There would be minimum varia- 
tions, that is, few “tracks” associated with a 
generalized strong script. 

The drawing up of an audit plan, for example, 
falls into the category of a strong script. Here, the
audit planning script is associated with a very
familiar or ritualistic auditing “track”. This situation
occurs because the events of an audit planning
“track” often follow a fixed order of audit
behaviors and procedures that are prescribed by
a standard checklist. The auditors know very 
well the specific track involved and they know 
the progressive sequence of events a priori. 


Proposition 2. Both weak and strong scripts are retained 
by auditors/accountants. Weak scripts are associated 
with situations in which potential events are expected 
but the order of events is not predictable a priori. Strong 
scripts are associated with situations in which specific 
events as well as the order of events are predictable a 
priori. 
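
The contrast drawn in Proposition 2 can be stated operationally. The sketch below is one possible reading, offered as an assumption rather than as Abelson's (1981) own formalization: an observed sequence is taken to fit a strong script only if the expected events occur in the expected order, and to fit a weak script if every observed event is among the expected ones, in any order. The function names are hypothetical; the circus acts come from the Abelson quotation above.

```python
def fits_strong_script(expected, observed):
    """Strong script: exactly the expected events, in the expected order."""
    return list(observed) == list(expected)


def fits_weak_script(expected, observed):
    """Weak script: every observed event is an expected one; order is free
    and not every expected event need occur."""
    return set(observed) <= set(expected)


circus = ["trapeze artists", "lion tamer", "jugglers"]
tonight = ["jugglers", "trapeze artists"]

print(fits_weak_script(circus, tonight))    # anticipated acts, unpredicted order
print(fits_strong_script(circus, tonight))  # the order was not the expected one
```

Under this reading every sequence that fits a strong script also fits the corresponding weak script, but not the reverse.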




One notable characteristic of a strong script,
in contrast to a weak script, is its repetitiveness.
Through prolonged exposure to the
same ritualistic script, an individual builds up a
repertoire of events and order of events associated
with that script. Over time, an individual
has a better (more accurate) memory of a
strong script than a weak script (Abelson, 1981)
because the strong script has become “unitized”
(Hayes-Roth, 1977). According to Hayes-Roth
(1977), repeated experiences strengthen the associative
connections and configurations among
scriptal actions and events. With additional ex- 
perience, these relational links become 
stronger. Ultimately, the associations and config- 
urations may be strengthened to the point of 
“unitization”. At that point, scriptal actions and 
events have progressed from a collection of in- 
dependent but related parts to a single, integ- 
rated memory representation. A strong script 
that gets “unitized” will be better remembered 
because it will be processed wholistically or less 
consciously. 


Proposition 3. Auditors/accountants will exhibit a more 
accurate memory recall for strong scripts than weak 
scripts. 


A more accurate memory of a strong script is 
also manifested in an individual’s script enact- 
ment (to be elaborated below); that is, the resul- 
tant scripted behavior will be more automatic. 
Thus, the auditors are expected to be capable of 
attending to every detail of the ritualistic audit
plan and effortlessly and efficiently (with less
conscious processing) executing the audit planning
procedures (with more automatic scripted
behavior). On the other hand, a weak script will 
require more conscious script processing and 
less automatic behavior. In other words, au- 
ditors are not expected to have developed the 
confirmation of a suspected irregularity script to 
the extent that they can process and execute it 
unconsciously and automatically. 
Script enactment
Abelson (1981) argues that several conditions 
must be present for scripted behavior to occur: 


1. A person must retain a cognitive representation of a
particular script.

2. A context or situation eliciting the script must be experienced.

3. The person must enact the script. 


Abelson’s actual phrase is that the person must 
“enter the script”, but this phraseology does not 
seem to capture the behavioral dynamics as- 
sociated with the process. Thus the phrase 
“enact the script” has been chosen because it 
better represents the translation of cognitive 
structure and process into behavior. Two other 
common terms, “instantiate” and “invoke”, are 
not used here because the former may get con- 
fused with the term “instantiated script” used by 
Graesser et al. (1979) to denote a partial copy of 
a generic script (to be discussed later). The lat- 
ter is not used because it is closer to the second 
condition for scripted behavior whereby an 
evoking context or situation must be present to
invoke the scripted behavior. The concept of
script enactment is further elaborated here.

As mentioned above, scripts represent cogni- 
tive retention of context specific knowledge of 
common or conventional behavior and event 
sequences. The performance of behaviors stem- 
ming from the cognitively held scripts is labeled 
“script enactment”. A clear distinction must be 
drawn between script enactment and script processing.
Script enactment refers to the processing
of cognitive scripts that have behavioral implications.
The processing of cognitive scripts
that have no (or not easily observable) be- 
havioral implications does not come under the 
ambit of script enactment. For example, processing
a mathematical script³ for the sole purpose
of solving an equation will only be described as 
script processing but not script enactment. In 
the case of script enactment, another clear dis- 
tinction must be made between conscious or 


³Conceptually, a mathematical script can be viewed as a script that describes a set of generalized hierarchical mathematical
arguments (statements) for solving a specific mathematical equation or problem. Eylon & Reif (1984) and Reif & Heller 
(1982) illustrated some mathematical scripts that could be used to solve mathematical problems in physics. 
unconscious processing of scriptal information
that results in automatic or less automatic be- 
havior. Not so well developed scripts and weak 
scripts will require more conscious script pro- 
cessing while fully developed scripts and strong 
scripts will require less conscious or unconscious
processing as the latter scripts have become
unitized or wholistically processed.
Some authors have maintained that scripted
behavior is more or less automatic (unconsciously
executed) (Feldman, 1981; Schneider
& Shiffrin, 1977; Abelson, 1981; Langer, 1978).
For example, Langer’s (1978) use of scripts to
explain “mindless” behavior portrays scripted behavior
as essentially automatic in nature. The
position taken in this line of argument is that in- 
dividuals do not necessarily actively process 
scripts anew in order to decide how to behave. 
Rather, they frequently can depend on personal 
or consensual schemas to understand and re- 
spond to situations with relatively little active 
script processing. The schemas that would be 
used specifically for such behavior are well 
developed scripts. Thus the processing of a well 
developed script, which entails less conscious 
effort, would tend to result in an automatic be- 
havior for a given situation. 

However, it seems that the treatment of the 
scripted behavior only as an automatic process is 
needlessly restrictive. People often consciously 
develop and monitor their scripts in a purposeful
manner, for instance, to satisfy their needs,
preferences, or self interests or simply to create
desirable impressions (Klein & Ritti, 1980; 
Snyder, 1974, 1977). As Goffman (1959) notes, 
where people are aware that certain situations 
require specific behaviors, for example, in situa- 
tions where it is necessary to bypass or create 
new accounting rules, standards and policies, 
scripted behaviors are not always spontaneously 
or unconsciously executed. Rather, people have 
the ability to reflect on what they are doing. 
Since people do reflect on what they are
doing, it is logical to expect that they engage in 
a mixture of conscious and/or unconscious 
script processing. Novel situations, for example, 
assignment to a new audit client, following a
new mandatory accounting method, or appointment
to a newly created management accounting
position, require intensive conscious processing
to decide appropriate events and behaviors.
Such conscious efforts are directed largely at
searching for the appropriate scriptal events and
behaviors and they reflect script processing,
while the subsequent strengthening of these
founding scriptal events and behaviors reflects
script development (to be elaborated later). On
the other hand, familiar or stereotypical situa- 
tions can be handled with little or no conscious 
processing. They can be characterized as au- 
tomatic script processing (mindless perform- 
ance of overlearned behaviors). Between these 
two extremes are events and behaviors requiring
progressively less active processing as situations
become increasingly conventional, repetitive,
and stereotypical.

Consistent with the terminology of script 
enactment, conscious script processing tends to 
result in less automatic scripted behavior while 
unconscious script processing tends to result in
automatic scripted behavior. It is suggested that
script enactment, in the sense of consciously or
unconsciously processed scripts that drive the 
less automatic or automatic execution of 
scripted behaviors, might affect a significant 
proportion of behaviors in auditing/accounting. 


Proposition 4(a). Auditors/accountants will process 
weak scripts more consciously and this will result in less 
automatic scripted auditing/accounting behaviors. 


Proposition 4(b). Auditors/accountants will process 
strong scripts less consciously and this will result in more 
automatic scripted auditing/accounting behaviors. 


⁴It is to be noted that this is quite different from some situations where script processing is bypassed. For instance, a situation
might be so drastically different from existing scripts that no adequate behavioral knowledge exists to deal with it (e.g. an 
auditor is just told that he is auditing the wrong client, or everybody on the financial accounting staff refused to cooperate 
with one another). A second instance occurs when an accountant must make a large number of similar decisions in a short 
time period, each based on a different configuration of information. In this case script processing is likely to be bypassed in 
favor of response programs and mechanistic application of rules to deal with the information (Abelson, 1976). 
Script development


The essence of the script concept is the assertion
that people possess cognitive representations
of common events or event sequences.
This stored knowledge is called into play
whenever situational cues evoke an expectation
for certain events to occur. This assertion raises
a fundamental question: how are such scripts acquired?

Scripts can be acquired by both direct and indirect
means. Direct script acquisition includes
interaction experience with other people,
events, or situations. This experience tends to 
initiate a script development process. For in- 
stance, auditors learn and internalize the scripts 
of behavior during an analytical review process 
by actually experiencing events that take place 
during such a review process. Repetition of the 
experience serves to solidify the script. For in- 
stance, a junior accountant repeatedly learns the 
sequence of behaviors or actions needed to prepare
a fund statement by trial and error. Such
repetitions help to formalize the scripts for expected
behavior in another similar situation. In
the developmental stage of a script for behavior, 
reward and reinforcement are important proces- 
ses for learning the behaviors that should be in- 
corporated into the structure of the cognitive 
script. New staff members, in particular, are ac- 
tively engaged in a sense-making process, trying 
to learn which behaviors are appropriate for 


which situations (Louis, 1980). Indirect script
acquisition occurs by means of oral and/or written
communication. Conversations with other
people communicate expectations for appropriate
behavior. Similarly, reading and/or watching
the scripts portrayed in training films can
provide good indications of behaviors fitting a 
number of common auditing/accounting situa- 
tions. 

Abelson (1976) proposes that the develop- 
ment of scripted understanding and behavior 
can progress through three evolutionary levels, 
which he terms episodic, categorical, and 
hypothetical scripts. An episodic script is elemental
and is retained as a context specific remembrance
of a single experience. When a person
experiences many similar episodes in similar
types of situations, however, the collection of
episodic scripts evolves into a categorical script 
— a script appropriate for a relatively narrow 
class of situations. Finally, if or when enough ex- 
perience or learning is acquired and generalized 
across contexts, a hypothetical or generalized 
script is abstracted and serves as a “metascript” 
to guide behavior in a range of related situations. 
Generalized scripts imply the organization of behavioral
knowledge into some meaningful structure.
Thus the evolution from the primitive
and specific episodic script to the complex and 
wide-ranging generalized script represents a 
progression from the concrete to the abstract, 
and from the context bound to the general. 
Once evolved, the generalized script serves as a 
functional repertoire whereby a specific script 
that is appropriate for a given situation can be 
tacitly “deduced” and performed. 

Two observations can be made about Abelson’s
(1976) conceptualization of script development.
First, Abelson uses the three discrete
category terms to describe three evolutionary
“landmarks” in script development. However,
script development is a continuum; it should not
be construed as three discrete category types of
scripts. Second, the evolution from the elementary
context specific script to the complex context
free script may be closely linked to the progres- 
sion in one’s expertise in a particular knowledge 
specific domain (Fiske et al., 1983). For ex- 
ample, as novice auditors/accountants acquire 
more work experiences and progress through to 
the senior levels, their auditing/accounting 
scripts for auditing/accounting tasks evolve from 
an elementary context specific to a complex 
context-free form. In this sense, the novice au- 
ditors/accountants gradually advance from 
being novices to experts in their auditing/ac- 
counting domain. 


Proposition 5. For specific auditing/accounting tasks,
novice auditors/accountants will be shown to hold
elementary context specific scripts while expert auditors/accountants
will be shown to hold complex context
free scripts.


Script recall
Recall of a previously stored script in long
term (permanent generic) memory has been
hypothesized to occur by means of a script
pointer plus tag model (Graesser et al., 1979,
1980; Schank & Abelson, 1977). For some incoming
stimuli (input), the script pointer plus
tag model guides a person in identifying a matching
script from the permanent generic memory,
in interpreting the stimuli, in generating inferences,
and in formulating expectations. An individual
is said to comprehend the stimuli once
this initial script processing is completed and a 
specific memory trace is constructed. According 
to the model, a specific memory trace is con- 
structed by copying a subset of the identified 
generic script that best fits the incoming stimuli. 
Thus what gets stored in a person’s memory is, in 
effect, not the input (the incoming stimuli) it- 
self, but a partial copy (subset) of the generic 
script, commonly referred to as the “instantiated
script”, that interprets and comprehends the
input. The instantiated script is connected to the
generic script from which it was copied by a
“pointer”. The instantiated script consists of
“typical”⁵ script events that are part of the input
and typical events that are not part of the input
but internally inferred, a process called “gap-filling”
(to be elaborated later). These typical
events exist as a single memory unit. On the
other hand, “atypical” events that are part of the
input and inferred atypical events that are not
part of the input are individually tagged along
the memory trace. As such, they exist as functionally
separate memory units. To aid
comprehension of the concept, a diagrammatic
representation of the script pointer
plus tag model is presented in Fig. 1.

Graesser et al. (1979, 1980), Smith & Graes- 
ser (1981), Bower et al. (1979) and Nakamura 
et al. (1985) have all shown that, in general, 
people tend to recall more accurately atypical 
than typical events. The rationale for this
phenomenon is that since typical and inferred
typical events exist as a single unit, it would be
difficult to discriminate between them in script
recall. Inferred typical events would be assumed
to be part of the input when, in fact, they are not.
On the other hand, atypical events exist as functionally
separate units and they are easier to discriminate.

Fig. 1. A diagrammatic representation of the script pointer
plus tag model (adapted and modified from Graesser,
1981).
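
The logic of the script pointer plus tag model can be illustrated with a deliberately simplified sketch. It rests on assumptions that go beyond the model as described above: the memory trace keeps a pointer to the whole generic script rather than a partial copy, only presented atypical events are tagged, and recognition is treated as all-or-none. The class and function names are hypothetical, and “students take notes” is an illustrative typical event that is not drawn from the cited studies.

```python
from dataclasses import dataclass, field


@dataclass
class GenericScript:
    name: str
    typical_events: frozenset


@dataclass
class MemoryTrace:
    generic: GenericScript                   # the "pointer" to the generic script
    tags: set = field(default_factory=set)   # atypical events, stored one by one

    def recognizes(self, event):
        """Typical events are reached through the pointer whether or not they
        were presented; atypical events only through their individual tags."""
        return event in self.generic.typical_events or event in self.tags


def encode(generic, presented_events):
    """Build a trace: typical input is absorbed by the pointer, the rest tagged."""
    tags = {e for e in presented_events if e not in generic.typical_events}
    return MemoryTrace(generic, tags)


lecture = GenericScript(
    "attending a lecture",
    frozenset({"the lecturer writes on the blackboard", "students take notes"}),
)
trace = encode(lecture, ["the lecturer writes on the blackboard",
                         "the lecturer sips a cup of coffee"])

print(trace.recognizes("students take notes"))               # never presented
print(trace.recognizes("the lecturer sips a cup of coffee"))  # presented and tagged
```

In the sketch a typical event that was never presented is still accepted as remembered, which mirrors the difficulty of discriminating stated from inferred typical events, whereas the tagged atypical event is retrieved as a separate unit.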

Besides Graesser et al.’s (1979) script pointer
plus tag explanation, there is also an elaboration 
explanation that explains the memory for typi- 
cal/atypical (congruent/incongruent) events 
(Hastie, 1980, 1981; Hastie & Kumar, 1979; Has- 
tie & Mazur, 1978; Srull, 1981; Hamilton & Gif- 
ford, 1976; Hamilton et al., 1980; Johnson & 
Judd, 1983). Hastie (1980) proposed a depth-of- 
processing-network associational model to ex- 
plain the relative memorability of congruent and 
incongruent information. Note that the compar- 
able term, congruent versus incongruent infor- 
mation, is used in this paradigm instead of typical 
versus atypical events. Hastie’s model was developed
based on the concept of human associative
memory suggested by Anderson (1972) and
Anderson & Bower (1973) and with reference
to a levels-of-processing framework (Craik &
Lockhart, 1972).


⁵For example, a typical event in an “attending a lecture” script is “the lecturer writes on the blackboard”. On the other hand,
an atypical event in the same script is “the lecturer sips a cup of coffee”.

According to Hastie’s depth-of-processing- 
network associational model, incongruent information
is salient or attention catching. People
spend more time and (perhaps) invoke more
complex perception and comprehension strategies
to account for incongruent information.
Thus, incongruent information is processed 
more extensively and perhaps more deeply than 
congruent information, resulting in richer, more 
durable, and more retrievable memory traces. 
Technically, Hastie’s depth-of-processing-net- 
work associational model suggests that the prob- 
ability of recall of an item is a function of the 
number of linkages it has to other items. Link- 
ages are established between any two items 
when they make contact in working memory.
For two items to become linked, they must not
merely cohabit working memory, but also, they
must be involved in a process Hastie considers
analogous to Craik & Lockhart’s (1972) elabora- 
tive processing. Highly incongruent items are re- 
membered well because they require elabora- 
tive processing, including causal attribution, to 
be understood. In the course of this reasoning 
about incongruent information, they become 
extensively linked to other pieces of information 
about the stimuli and therefore are easier to re- 
call. 
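
The central claim of the associational account, that recall depends on how many linkages an item has, can be sketched as a simple count over an associative network. The sketch is an assumption-laden illustration, not Hastie's (1980) specification: no functional form for recall probability is given in the text, so items are only ranked by their number of links, and each “episode” stands for a set of items elaborated together in working memory. The function names and the item “going concern doubt” are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def link_counts(elaboration_episodes):
    """Count distinct links per item; items become linked when they are
    elaborated together in the same episode."""
    links = defaultdict(set)
    for episode in elaboration_episodes:
        for a, b in combinations(episode, 2):
            links[a].add(b)
            links[b].add(a)
    return {item: len(neighbours) for item, neighbours in links.items()}


def rank_by_expected_recall(elaboration_episodes):
    """Order items from most to fewest links (ties broken alphabetically)."""
    counts = link_counts(elaboration_episodes)
    return sorted(counts, key=lambda item: (-counts[item], item))


# The incongruent item is elaborated against two other items.
episodes = [
    ("no past operating loss", "sales have fallen"),
    ("no past operating loss", "going concern doubt"),
]
print(link_counts(episodes))
print(rank_by_expected_recall(episodes))
```

Because the incongruent item takes part in more elaborative episodes, it accumulates more links and is ranked first, which is the direction of the effect the model predicts.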

Both Graesser et al.’s (1979) script pointer plus
tag model and Hastie’s (1980) depth-of-processing-network
associational model for memory
recall would predict that atypical (incongruent)
scriptal events would be more accurately remembered
than typical (congruent) scriptal
events. This is often referred to as the “typicality”
effect. Auditors/accountants often have to
consider both typical and atypical events as- 
sociated with certain tasks. For example, “sales 
have fallen substantially in the past few years” 
will signal a typical event associated with auditing
a client that has a going concern problem. On
the other hand, “despite sales having fallen sub- 
stantially in the past few years, there has been no 
past operating loss” will signal an atypical event. 


Proposition 6. Auditors/accountants will exhibit a more 
accurate memory recall for atypical scriptal events than 
typical scriptal events associated with specific auditing/ 
accounting tasks.


Gap filling 
Hastie’s model suggests that gap-filling occurs
because there is apparently poorer memory for
the less deeply processed congruent events than for
the more deeply processed incongruent events.
Further, Graesser’s model suggests that gap-filling
takes place because there is an apparent confusion
between the typical events of an input and
internally inferred typical events that are not 
part of the input. 

The concept of gap-filling needs further 
clarification. A clear distinction must be made 
between two aspects of gap-filling: the quantity 
of gap-filling and the nature of gap-filling. The 
quantity of gap-filling refers to the number of
events people gap-fill to make up for the forgotten
events. For example, a subject may be given
a script containing ten events to remember. In a
later recall (or recognition) test, that person
may recall six of the original ten events and may
gap-fill three of the four forgotten events. In the
extreme case of perfect memory, the subject can
remember everything and therefore there will
be no need for gap-filling. Conversely, in the case
of poor memory, there will be more forgotten
events and therefore there will be a higher occurrence
of gap-filling. Accordingly, the quantity
of gap-filling bears an inverse relationship to
the quantity of recall.⁶


Proposition 7(a). The quantity of gap-filling by auditors/
accountants will bear an inverse relationship to the quantity
of recall associated with specific auditing/accounting
scripts.
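
The ten-event example given earlier can be tallied explicitly. The bookkeeping below is one reading of that example and is offered as an assumption: gap-filled events are taken to replace forgotten events one for one, so the number gap-filled can never exceed the number forgotten. The function name and the labels of the returned quantities are hypothetical.

```python
def gap_filling_tally(total_events, recalled, gap_filled):
    """Split a presented script into recalled, gap-filled and unfilled events."""
    if not 0 <= recalled <= total_events:
        raise ValueError("recalled must lie between 0 and total_events")
    forgotten = total_events - recalled
    if not 0 <= gap_filled <= forgotten:
        raise ValueError("gap_filled cannot exceed the number forgotten")
    return {
        "recalled": recalled,
        "forgotten": forgotten,
        "gap_filled": gap_filled,
        "unfilled": forgotten - gap_filled,
    }


# Ten events presented, six recalled, three of the four forgotten ones gap-filled.
print(gap_filling_tally(10, 6, 3))

# Room for gap-filling shrinks as recall rises.
for recalled in (2, 6, 10):
    print(recalled, gap_filling_tally(10, recalled, 0)["forgotten"])
```

The loop shows the inverse relationship in its simplest form: the more events are recalled, the fewer forgotten events remain to be gap-filled, and with perfect recall none remain.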


⁶This can be presented as: (quantity of recall) + (quantity of gap-filling) = total quantity remembered. It could be further
expanded as: (quantity of accurate recall + quantity of inaccurate recall) + (quantity of accurate gap-filling + quantity of 
inaccurate gap-filling) + (quantity forgotten) = total quantity remembered. 




So far, the quantity of gap-filling has attracted 
little attention in the psychology literature. 
However, it will be seen later that this first as- 
pect of gap-filling deserves more consideration 
in advancing the concept of weak versus strong 
script. 


Psychology literature on gap-filling appears to 
concentrate exclusively on the nature of gap-fill- 
ing (Brewer et al., 1981; Graesser et al., 1979, 
1980; Cantor & Mischel, 1977). The nature of 
the gap-filled events refers to the characteristics 
or quality of these events. Research in category
theory⁷ (Rosch et al., 1976; Cantor & Mischel,
1977; Mandler, 1979; Taylor, 1981) has
suggested that the nature of the gap-filled events
is driven by the prototypicality of the original
script. Thus “chairs” as a category of furniture
will be used for gap-filling in a “furniture” script 
rather than in, say, a “meeting” script. In this situ- 
ation, people gap-fill events they did not re- 
member, but which they think must have been 
associated with the original script. Here, the gap- 
filling is not automatic, but is based on the likeli- 
hood of a given event as part of the scripted be- 
havior (I do not remember seeing this but it was 
probably there). 


Research by Graesser et al. (1979) has also 
shown evidence that people tend to gap-fill typi- 
cal events. As noted earlier, Graesser et al.’s 
script pointer plus tag model predicts that typi- 
cal events, in comparison to atypical events, are 
held as a single unit in the memory trace. As 
such, people confuse typical events that are part 
of the input and typical events that are not part of 
the input but are inferred. Here, the gap-filling is 
automatic, and the individuals think the events 
were present in the original script (I definitely 
remember this).


Proposition 7(b). The nature of gap-filling by auditors/
accountants will bear a direct relationship to the
prototypicality or typicality of recall associated with specific
auditing/accounting scripts.


The quantity of gap-fillings is expected to be 
different for weak and strong scripts. In the case 
of a strong script, for example, in an audit plan- 
ning script, strong memory is expected due to 
the repetitive performance of the very familiar 
and ritualistic audit planning procedures. The 
variations in the audit procedures (events) are 
few and predictable. For example, the basic
events of the audit planning script would
involve deciding the suitability of the client,
obtaining background information, assessing risk
and materiality, performing a preliminary
evaluation of internal controls, and documenting
the list of planned testing procedures. The
progressive sequence of events is also very much
fixed. Since auditors are expected to remember a
strong script well, they would exhibit a low
quantity of gap-fillings. Indeed, if the audit
planning script is very strong, there will be a
minimal quantity of gap-fillings. In comparison,
the auditor is expected to forget more events in
a weak script such as the confirmation of a
suspected irregularity script. The auditor has
difficulty remembering it because it has many
potential tracks, and the procedures, as well as
the sequence of the procedures, within each track
cannot be precisely anticipated a priori.


Proposition 8(a). Auditors/accountants will gap-fill a smaller
quantity of events in strong scripts than in weak scripts.


Research by Graesser et al. (1979) and Bower
et al. (1979) suggests that the nature of the
gap-filled events will lean towards those that are
prototypical/typical to the original script.
Gap-filling of events which are prototypical/typical
of a script is likely to apply to auditors recalling
either a strong or a weak script. In other words,
irrespective of the quantity of gap-fillings, the
gap-filled events are more likely to be typical
than atypical. According to Graesser et al.’s
(1979) script pointer plus tag model, atypical
events are individually tagged onto the memory
trace as functionally separate units and
therefore they would be easier to discriminate.
Further, a strong script, in contrast to a weak
script, has fewer tracks (Abelson, 1981) and the
events within these tracks are more prototypical
(Rosch & Mervis, 1975). It follows that the
gap-filled events in the audit planning script will be
judged more prototypical/typical than gap-filled
items in the confirmation of a suspected
irregularity script. Consequently, there is a higher
occurrence of gap-filled prototypical/typical
events in a strong script than in a weak script.


7 Rosch & Mervis (1975) view natural language categories as consisting of complex networks of overlapping features.
According to this view, differences in subjects’ processing of typical and atypical items are attributed to typical category items
having more features in common with one another than atypical items. For the sake of distinctiveness, categories are
represented cognitively by prototypes: actual or imaginary instances of the category that contain attributes most
representative of items inside the category and least representative of items outside the category. Once a prototype of a
category has been formed, membership in the category is assessed in terms of “prototypicality” or perceived similarity to the
prototypical instance. Abelson (1981, p. 725) made an analogy between category and script, “If one regards each specific
realization of a script (each dental visit, say) as an object, and the events transpiring during the episode as attributes, then
one can regard a script as a psychological category.”


Proposition 8(b). Auditors/accountants will gap-fill
more prototypical/typical events in strong scripts than in
weak scripts.


ISSUES ASSOCIATED WITH CONDUCTING 
AUDITING/ACCOUNTING RESEARCH ON 
SCRIPTS


Initial script research should focus on the
above propositions in auditing/accounting.
Some issues that are specifically associated with
conducting auditing/accounting research on
scripts are discussed below.


Presentation of experimental scripts 


There are at least two strategies for presenting
experimental scripts to auditing/accounting
subjects. One is to give them only a script
heading (cue) and ask them to articulate (verbal
protocol approach) or to write (written protocol
approach) about the events that are invoked by
the given script heading. Here the main research
objective is to demonstrate and establish that
auditors/accountants hold certain scripts. It is
therefore important that the experimenter does
not create, introduce or manipulate the content
of the script a priori, but simply extracts or
measures the existence of the scriptal events from
the subjects’ verbal or written protocols. The
second strategy involves the experimenter
providing and/or manipulating a script (not just a
script heading). Here, the main objective is to
investigate the effects of the inclusion of various
events in that script on the subjects’ verbal or
written protocols. It is therefore important that
the experimenter has a priori evidence that
auditors/accountants hold such a script and seeks to
invoke the scripted behavior of interest from the
subjects by introducing or manipulating the
established scriptal events.

The first strategy is often used in a novel
situation where the content, and indeed the
existence of a script, is uncertain or cannot be
specified a priori. The second strategy is often
used in situations where an a priori generalized
script can be reasonably assumed to be possessed
or already learned by the subjects (e.g.
Schank & Abelson, 1977; Graesser et al., 1979;
Bower et al., 1979; Light & Anderson, 1983;
Schmidt & Sherman, 1984; O’Sullivan & Durso,
1984; Maki & Swett, 1987). The second strategy
is also closer to the essence of the script concept
in that people are assumed to have internalized
scripts through repetitive exposures and
experience. Since auditing/accounting scripts of
interest are usually about situations where a
generalized auditing/accounting script is
either known, can be reasonably assumed a
priori, or is frequently encountered by
experienced auditors/accountants, the second
strategy is envisaged to be more widely used for
studying scripted behavior in auditing/accounting
research. Further, by presenting script
events a priori, the experimenter can manipulate
and investigate the gap-filling phenomenon.
It will be difficult, if not impossible, to observe
which events are gap-filled if the first
strategy is used.


Verbal versus written protocol approach

Free verbal (e.g. Bouwman et al., 1987) or
written protocols are useful tools for tapping
auditors’/accountants’ cognitive scripts. In
auditing/accounting, the written protocol approach is
envisaged to be more widely used because (a)
laboratory experiments are usually arranged to
be conducted on-site, that is, at the auditors’/
accountants’ offices or at their in-house training
centers. Under those circumstances, it is difficult
to secure a quiet room for the verbal protocol
approach and even if a reasonably quiet room is
available, the auditors/accountants may not be
accustomed to speaking their thoughts aloud into
a tape recorder. On the other hand, they habitually
write down their thoughts on paper without
verbalization, for example, when writing an audit
recommendation letter to the management about
internal controls; (b) the written protocol
approach by-passes the need to transcribe
the subjects’ verbal protocols from the tape
recorder onto paper for subsequent analyses.
Thus, any error in transcription is minimized;
and (c) so far, a great majority of the psychology
literature on scripts has used the written rather than
the verbal protocol approach.




IMPLICATIONS OF SCRIPTS FOR AUDITING 
AND ACCOUNTING PRACTICES 


Scripts aid in the understanding of auditing/accounting
behavior in practice. For example,
some decision-making processes might be
studied in terms of decision scripts. When a
decision about a particular audit/accounting task is
to be made, prior experience relevant to the task
is likely to be remembered in script form. The
auditor/accountant, based on a recalled decision
script, has some structured expectations not
only about the appropriate process to be used to
make the decision, but also about the likely
subsequent events that will result from the decision
being considered. Hence, the comparison of the
current decisional situation to the prototypical
script for such decisions acts as a guide to the
appropriate decision-making process and behavior
(Abelson, 1976; Waller & Felix, 1984).

Script-based decision making may not necessarily
result in good decision making, as scripts
appear to be heuristic knowledge structures that
aid in reducing the cognitive complexity of
decision-making processes. The scripting of
decision-making situations therefore has a drawback:
it can induce a failure to be aware of the
fine-grained differences that distinguish a current
decision problem, because the decision process is
based on a generalized script rather than a
step-by-step consideration of the uniqueness of
events relevant to the present situation. Scripted
understanding of decision situations might
therefore lead to inappropriate action. On the
other hand, scripts provide a vehicle for
understanding why many of the subtle nuances of
problem solving and decision making that
experienced auditors/accountants are expected to
consider often appear to be inexplicably bypassed.
This is because auditors/accountants generally
might not be purely rational information
processors (Cyert & March, 1963; March &
Simon, 1958; Simon, 1976). Thus the study of
script influences might be a useful prelude to
improving the decision-making process itself.

Future research that uncovers the structure
and content of scripts should help clarify the
information auditors/accountants actually use and
the organizing or sequencing of events that
underlies auditing/accounting task behaviors.
Since scripts contain explicit information on the
appropriate sequencing of various task events
and behaviors, well-developed scripts may be
used as a vehicle for training junior auditors/accountants
in coming to grips with the appropriate
actions and events associated with specific
auditing/accounting tasks. Work in the realm of
psychology has demonstrated the role of scripts
in comprehending textual and verbal descriptions
of events. There is now a need to orient the
application of scripts to focus on the study of
behavior in auditing/accounting. This paper is
intended as a point of departure towards that end.


Abstract


This paper describes an experiment investigating whether the most appropriate report format is a function 
of the information needed by the decision maker. The latter is a task characteristic which can be described 
in terms of the question to be answered with an information presentation. Thirty MBA students were asked
to answer a set of questions with financial information presented in four forms: line graph, bar chart, pie 
chart, and table. The results indicate that the question to be answered and the forms of presentation 
interactively affect performance, and that no one form of presentation is best in all situations.


Accountants have long been concerned with
determining what information should be presented
to decision makers.¹ Recently, accounting
researchers have been focusing an increasing
amount of attention on determining how that
information should be presented. Changing the
presentation or amount of information is one of
three basic options which can be used to improve
decision making (Libby, 1981, p. 101).
Ricchiute (1984) has shown that auditors’ judgments
may be affected by the mode (i.e., auditory,
visual, or auditory/visual) of presentation
and calls for additional research both across and
within different modes of presentation. Moriarity
(1979) and Stock & Watson (1984) provide
evidence that financial statement users’ decisions
can be substantially improved by presenting
accounting information in a graphic format. Thus,
the effects of different methods of presenting
financial information are an important area of
research with potentially significant implications
for accounting.

Further investigation of the effects of variations
within the visual mode (e.g., graphic or
tabular presentation) is a particularly timely
topic. Developments in information systems
technology make graphic presentations of
accounting information a practical alternative to
the traditional tabular presentations (Ives,
1982). In fact, at least one public accounting
firm is providing clients with financial statements
in graphic form (Jarett, 1981). However,
little is known regarding the factors which determine
the form in which financial information
should be presented to decision makers.
Numerous authors, including Dickson et al.
(1986), DeSanctis (1984), Bertin (1983), and
Ives (1982), have noted that the decision
maker’s task most likely has a significant impact
on which form of visual information presentation
results in the best performance. Blocher et
al. (1986) have shown that the relative effectiveness
of different forms of presentation may
be a function of the amount of information
which is presented to, and must be processed by,
the decision maker. They state that further research
is needed to investigate “... other types of
interactions [between] format types [and] task

*The author wishes to express his appreciation to the members of his dissertation committee, S. Michael Groomer, Les 
Heitger, A. Milton Jenkins and Bill Perkins, and to Linda Lovata and David Ricchiute for their comments on earlier versions 
of this paper. Financial support for this project was provided by the Institute for Research on the Management of Information 
Systems at Indiana University, and by Deloitte, Haskins and Sells. 


1 For a further discussion of this literature see Foster (1986), Ashton (1982), and Anderson (1976).


characteristics . . .” (p. 468). The objective of
this study is to investigate whether the information
which a decision maker wishes to extract
from an information presentation is a task
characteristic which affects the appropriateness
of different report formats. The experimental
task is defined in terms of the questions to be
answered with an information presentation.
Performance is measured by the time taken to answer
the question and the accuracy of the
question-answer.


PRIOR RESEARCH 


The effects of different methods of displaying 
financial data have “. . . received surprisingly lit- 
tle attention from .. . accountants . . . and au- 
ditors” (Libby, 1981, p. 117). Ricchiute (1984) 
investigated the effects of variations in the mode 
(visual, auditory, and visual/auditory) of infor- 
mation presentation on auditors’ judgments, and 
found that the mode of presentation may affect 
auditors’ decisions in experimental settings. 
Other research in accounting has concentrated 
on variations within the visual mode of presenta- 
tion. For example, Moriarity (1979) and Stock & 
Watson (1984) investigated the use of mul- 
tidimensional graphics for the presentation of fi- 
nancial information in a bankruptcy prediction 
task. The findings of both studies indicate that in- 
dividuals can make significantly better bank- 
ruptcy predictions using graphics than they can 
using tabular presentations. Furthermore, the 
subjects using graphics were able to form more 
accurate predictions than those produced by 
statistical models. Blocher et al. (1986)
examined the effects of tabular and color
graphic reports on the accuracy and bias of internal
auditors’ decisions. Their subjects were
asked to make judgments concerning the legitimacy
of invoices submitted for payment based
on the amount and risk associated with the different
types of costs on the invoices. Task complexity
was manipulated by varying the number
of cost categories on an invoice. Their results
indicate that graphic reports are better for low
levels of complexity and tabular reports are better
for high levels of complexity. The findings of
these four studies demonstrate that changes in
the method of presenting financial information
affect, and can be used to improve, decision
performance.

Information systems researchers, statisticians,
psychologists, and educators have investigated
the relative advantages of various graphic and
tabular forms of visual presentation for displaying
both financial and non-financial information.
One set of these studies indicates that graphic
presentations result in better performance (e.g.,
Lucas, 1981; Benbasat & Schroeder, 1977);
another set indicates that there is no difference
between performance with graphic and tabular
presentations (e.g., Wainer et al., 1982); and a
third set indicates that tabular presentations
result in better performance (e.g., Lusk &
Kersnick, 1979).

The conflicting and equivocal findings of prior
research investigating the effects of different
forms of presentation are the result of each study
suffering from at least one of three weaknesses.
First, with the exception of Wainer et al. (1982),
the research was not grounded in a theory
specifically related to the use of information
presentations. Consequently, there was a failure to
specify and control variables likely to affect the
appropriateness of different forms of presentation.
For instance, as previously noted, few researchers
have recognized that task characteristics
are likely to influence performance. Second,
most business researchers who have examined
the appropriateness of different report formats
used tasks which may have confounded experimental
measurement of the effects of different
forms of presentation. For example, both Benbasat
& Schroeder (1977) and Lucas (1981) performed
experiments in which the subjects participated
in a business game. In these studies,
numerous uncontrolled and potentially confounding
variables (such as the subjects’ decision
models) intervened between the experimental
treatments (i.e., the information presentations)
and the measurement of performance
(e.g., profits). Third, non-business researchers
have generally used information sets of a low
level of complexity in their studies (e.g., Price et
al., 1974; Wainer et al., 1982). However, the
relative advantages of different forms of visual
presentation may become apparent only with more
complex information sets (Wainer et al., 1982).

This study is grounded in a theory specifically 
related to the design and use of information pre- 
sentations. The experimental task does not con- 
found the measurement of performance, and the 
information set is sufficiently complex to allow 
the relative advantages of different forms of pre- 
sentation to become apparent. 


THEORY AND HYPOTHESES 


A well-developed and tested theory which can 
be used to specify the circumstances under 
which different forms of presentation are most 
appropriate does not exist (Wainer & Thissen, 
1981). Accordingly, Bertin’s (1983) theory is 
used in this study “. . . as a first approximation of 
human behavior .. .” (Libby et al. 1985, p. 213) 
and to identify the variables which are likely to 
affect performance. Wainer & Thissen identify 
Bertin’s as the only complete, albeit rudimentary 
and untested, theory concerning performance 
with different forms of visual presentation. 

An important part of any decision process is
the acquisition of the information cues which
are used as input for the decision maker’s decision
model. Bertin (1983) refers to the process
of obtaining information cues from an information
presentation (i.e., a graph or table) as the
answering of questions. His theory is focused on
determining the most appropriate form of
presentation for a given question.

According to Bertin (1983), performance 
with an information presentation is a function of 
three factors: 

— the information set presented, 
— the question to be answered, and 
— the form of presentation. 

Information sets consist of variables and a 
title, and are described by the number and type 
of variables they contain. Variables are described 
by their type: categorical, ordinal, or quantita- 
tive. Consider, for example, an information set 
whose title is “Stock price, in dollars, for General
Motors and General Electric, from 1980 to
1985.” The stock price in dollars is a quantitative 
variable, the year is an ordinal variable, and the 
company name is a categorical variable. 
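
The description of an information set above can be made concrete with a small data structure. The following is only an illustrative sketch: the class and field names (`VariableType`, `Variable`, `InformationSet`, `count_by_type`) are assumptions introduced here and are not part of Bertin's (1983) theory or of this study's materials.

```python
from dataclasses import dataclass
from enum import Enum


class VariableType(Enum):
    """The three variable types named in the text."""
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    QUANTITATIVE = "quantitative"


@dataclass(frozen=True)
class Variable:
    name: str
    kind: VariableType


@dataclass(frozen=True)
class InformationSet:
    """A title plus the variables it contains (hypothetical representation)."""
    title: str
    variables: tuple

    def count_by_type(self):
        # An information set is described by the number and type of its variables.
        counts = {kind: 0 for kind in VariableType}
        for variable in self.variables:
            counts[variable.kind] += 1
        return counts


# The stock price example from the text.
stock_prices = InformationSet(
    title=("Stock price, in dollars, for General Motors and "
           "General Electric, from 1980 to 1985"),
    variables=(
        Variable("stock price (dollars)", VariableType.QUANTITATIVE),
        Variable("year", VariableType.ORDINAL),
        Variable("company", VariableType.CATEGORICAL),
    ),
)

print(stock_prices.count_by_type())
```

Describing the set by counts per type is convenient because, as discussed below, the number and type of variables limit which forms of presentation can display the set effectively.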

The question to be answered defines the infor- 
mation to be extracted from the information 
presentation and, therefore, how the decision 
maker interacts with the information presenta- 
tion. Questions are described in terms of the 
amount of information which must be examined 
to answer them. The more information which 
must be examined to arrive at an answer, the 
more complex is the question. 

The form of presentation (e.g., a bar chart) is 
the manner in which the information is visually 
represented to the decision maker. The number 
and type of variables in an information set limit 
the forms of presentation which can be used to 
effectively display it. For example, information
sets containing three quantitative variables cannot
be accurately displayed with two-dimensional
graphs.


According to Bertin (1983), the most approp- 
riate form of presentation for a particular ques- 
tion is the one which minimizes the effort — 
which Bertin measures in terms of time — the 
user expends to interpret the relevant aspects of 
the information and obtain an answer to his 
question. Different forms of presentation make 
most apparent different aspects of the informa- 
tion displayed, and questions of different levels 
of complexity pertain to different characteristics 
or relationships within the information. There- 
fore, different forms of presentation are most ap- 
propriate for different questions. It is on this im- 
plication of Bertin’s theory that the current 
study is focused. 

While Bertin (1983) focuses on the efficiency 
of different forms of presentation, proponents of 
graphics assert that graphic information displays 
can be used to increase the effectiveness with 
which information is communicated to a deci- 
sion maker. Lusk (1979), who examined the ef- 
fects of arithmetical transformations on the diffi- 
culty of disembedding question-answers from a 
report, found that the report which makes it 
easiest to answer a question results in the most 
accurate answers. This suggests that Bertin’s
assertions regarding the circumstances under
which different forms of presentation are
appropriate can be applied to situations in which
performance is measured in terms of accuracy as
well as time.² Accordingly, in this study
performance is measured in terms of both time and
accuracy.

These propositions lead to the following 
hypotheses: 


H1 — The form of presentation which allows a question 
to be answered in the least amount of time will be diffe- 
rent for questions of different levels of complexity. 

H2 — The form of presentation which results in the most 
accurate answers to a question will be different for
questions of different levels of complexity.


The first hypothesis tests whether the most effi- 
cient form of presentation is dependent on the 
level of question complexity while the second 
hypothesis tests whether the most effective form 
of presentation is dependent on the level of 
question complexity. Finding support for either 
or both of these hypotheses would provide evi- 
dence that different forms of presentation are 
best for different tasks and that the most approp- 
riate form of presentation cannot be specified 
without taking into account the question to be 
answered. 


METHOD 

The methodology chosen for this study was a
laboratory experiment. Thirty MBA students at a
large midwestern university participated in the
experiment one at a time. Abdolmohammadi &
Wright (1987) argue that for structured tasks 
such as the one in this study the performance of 
students should not differ significantly from that 
of real world decision makers. A full-factorial, 
within-subject experimental design is used; each 
subject receives all twenty experimental treat- 
ments (five questions manipulated over four 
forms of presentation) in a different random 
order. 
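
As an illustration of the full-factorial, within-subject design just described, the sketch below crosses five questions with four forms of presentation and shuffles the resulting twenty treatments separately for each subject. The question labels, the function name `treatment_order`, and the seeding scheme are assumptions made for this example; the study does not report how its random orders were generated.

```python
import itertools
import random

# Hypothetical labels; the study's actual question wording is not reproduced here.
QUESTIONS = ("Q1", "Q2", "Q3", "Q4", "Q5")
FORMS = ("line graph", "bar chart", "pie chart", "table")


def treatment_order(subject_id, seed=0):
    """Return all 20 question-by-form treatments in a random order for one subject."""
    treatments = list(itertools.product(QUESTIONS, FORMS))
    # A per-subject generator keeps each subject's order reproducible.
    # Distinct orders are not strictly guaranteed, but with 20! possible
    # permutations a repeat is extremely unlikely.
    rng = random.Random(seed * 10_000 + subject_id)
    rng.shuffle(treatments)
    return treatments


# Thirty subjects, each receiving every treatment exactly once.
orders = {subject: treatment_order(subject) for subject in range(1, 31)}
print(len(orders[1]))
```

Because every subject sees every treatment, differences between subjects do not confound comparisons between forms of presentation; the random ordering guards against systematic order effects.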

During the videotaped instructions the subject
is told to proceed through the experiment at
his/her own pace (i.e., there are no time limits),
but that the speed and accuracy with which the
questions are answered are equally important.³
During each of the twenty trials of the experiment,
the subject sees one information presentation
and one question displayed on a microcomputer
screen. The subject’s task is to answer each
question using the information displayed at the
same time. The subject records his/her answers
using the keyboard and the time taken to answer
each question is measured unobtrusively by the
computer. After each question is answered, the
CRT screen is cleared and the next question and
information presentation are displayed.
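
The trial procedure above records two things per trial: the answer and the time taken to give it. A minimal sketch of such a recording step follows. It is not the software used in the experiment; `run_trial`, `TrialResult`, and the exact-match scoring of accuracy are assumptions for illustration only, since the study's scoring rule is not stated in this passage.

```python
import time
from dataclasses import dataclass


@dataclass
class TrialResult:
    answer: str
    seconds: float
    correct: bool


def run_trial(get_answer, correct_answer, clock=time.perf_counter):
    """Time a single trial without involving the subject in the measurement.

    get_answer is any callable that blocks until the subject has responded
    (for example, a keyboard prompt). The clock is injectable so the function
    can be checked without waiting in real time.
    """
    start = clock()
    answer = get_answer()
    elapsed = clock() - start
    # Assumption: accuracy is scored as an exact match with the correct answer.
    return TrialResult(answer=answer, seconds=elapsed,
                       correct=(answer == correct_answer))


# Usage with a simulated subject and a simulated clock.
ticks = iter([10.0, 12.5])
result = run_trial(lambda: "1984", "1984", clock=lambda: next(ticks))
print(result)
```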


Variables

There are one control variable, two independent
variables, and two dependent variables in
the experiment. The control variable is the
information set. The independent variables are the
form of presentation and the question to be
answered. The two dependent variables are the
time taken to answer a question and the accuracy
of the question-answer.


2 It could be asserted that, at least at higher levels of task complexity, individuals might attempt to minimize the effort they
must expend to answer a question through the use of heuristics. An example of such a heuristic would be truncating the
search for a question-answer after a satisficing, rather than an optimal, answer has been obtained. The use of heuristics would
likely reduce both the time taken to answer a question and the accuracy of the question-answer. Visual inspection of the data
from the experiment and the results of the statistical analysis indicate that the subjects did not resort to the use of heuristics.


3 The relative importance of the different dimensions of performance will vary across tasks depending on the decision maker’s
pay-off matrix. The relative importance of time and accuracy was provided to minimize the possibility that the subjects
would attempt to infer their importance from the experimental instructions or procedures and confound the results. The
subjects were told to assign equal weight to each measure because both can affect the appropriateness of different forms of
presentation and they were considered to be equally important with respect to the objectives and hypotheses of this study.
As discussed in the Results section, no trade-off between the two dimensions of performance was found and telling the
subjects that they were equally important does not appear to have affected which forms of presentation were best for each
question or biased the findings.


Information set. The information set consists 
of one categorical, one ordinal, and one quantita- 
tive variable (i.e., time series data consisting of, 
for example, four companies’ profits for eleven 
years) and contains almost double the number 
of data points (forty-four) used in any previous 
non-business study. Further, the information set
is as complex as it can be while still being effectively
displayed on a normal-size, thirteen-inch CRT screen
with the four forms of presentation used in
the study.


Forms of presentation. Four forms of presen- 
tation are used in the experiment: bar charts, 
line graphs, pie charts, and tables. A potentially 
confounding factor in past research was the use 
of inappropriate or poorly-designed information 
presentations. To ensure that the results of this
study are not similarly confounded, the design
guidelines set forth by Bertin (1983) for graphic
presentations, and Ehrenberg (1977) for tabular
presentations, are used to select and construct
the information presentations.⁴

Bar charts and line graphs are the two stand- 
ard forms of graphic presentation which Bertin 
identifies as appropriate for presenting the type 
of information set used in this study (i.e., time
series data). Pie charts are included as a form of 
graphic presentation due to their extensive use 
in business publications and the fact that the 
ability to prepare and display pie charts is a com- 
mon feature of data management software. Ta- 
bles are the form of presentation which accoun- 
tants have traditionally used (Jarett, 1981) and 
the basic alternative to graphs for the visual dis- 
play of accounting information (Lusk, 1979). 


Questions. Bertin (1983) identifies the question
to be answered as the relevant task characteristic
to be controlled when examining performance
with information presentations. The
complexity of extracting a question-answer
from an information presentation is dependent
upon what the user of the presentation must do
to isolate and extract the relevant information.
In this study the complexity of the questions is
evaluated based on the steps which are predicted
to be performed with the information
cues to answer them, using a method suggested
by Davis et al. (1985). The results of their study
indicate that the complexity of questions can be
evaluated based on the following list of steps
(ordered by increasing complexity):


1. Identifications (e.g., identifying a line on a line graph
or row in a table);

2. Scans (e.g., locating the highest points on a line in a
line graph);

3. Comparisons (e.g., comparing two amounts or
slopes);

4. Estimations (e.g., estimating the approximate sum or
difference of two numbers).
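

The ordered list of steps above lends itself to a simple ranking scheme, sketched below. The ordering of the four step types comes from the list; everything else is an assumption made for illustration: the rule used to combine steps into a single ranking key (most complex step first, then number of steps), the function name `complexity_key`, and the three example questions, which are hypothetical and are not the questions used in the experiment.

```python
from enum import IntEnum


class Step(IntEnum):
    """Step types in order of increasing complexity, as listed in the text."""
    IDENTIFICATION = 1
    SCAN = 2
    COMPARISON = 3
    ESTIMATION = 4


def complexity_key(steps):
    # Assumed aggregation rule (not taken from the study): rank a question by
    # its most complex step, breaking ties by the number of steps required.
    return (max(steps), len(steps))


# Hypothetical questions with the steps a subject is predicted to perform.
questions = {
    "find one company's profit in a given year":
        [Step.IDENTIFICATION, Step.IDENTIFICATION],
    "find the year of a company's highest profit":
        [Step.IDENTIFICATION, Step.SCAN],
    "estimate the difference between two companies' profits":
        [Step.IDENTIFICATION, Step.IDENTIFICATION, Step.ESTIMATION],
}

ranked = sorted(questions, key=lambda q: complexity_key(questions[q]))
for question in ranked:
    print(question)
```

Other aggregation rules (for example, summing step ranks) would be equally defensible under the text; the point is only that the predicted steps, not the form of presentation, determine a question's relative complexity.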


Consistent with Bertin’s treatment of the relative complexity
of questions as invariant to changes in the form of
presentation, the relative increases in question complexity
resulting from the performance of these steps are assumed to be
the same within all forms of presentation. For instance, it is
assumed that estimating the difference between two cues always
increases the complexity of a question more than comparing
the same two cues. This does not imply that the difficulty of a
particular step is the same across forms of presentation; for
instance, it is not assumed that the difficulty of comparing the
size of two bars in a bar chart is equal to that of comparing
the size of two pie slices in a pie chart.
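The ordering assumption can be illustrated with a short sketch. The Python below is not part of the study: the `Step` enumeration, the `complexity_key` function, and the step counts are hypothetical, and the lexicographic comparison is only one assumed way of encoding the idea that a harder step adds more complexity than an easier one regardless of the form of presentation.

```python
from enum import IntEnum


class Step(IntEnum):
    """Step types ordered by increasing complexity (Davis et al., 1985)."""
    IDENTIFICATION = 1
    SCAN = 2
    COMPARISON = 3
    ESTIMATION = 4


def complexity_key(step_counts):
    """Hypothetical ranking key: counts of each step type, hardest first.

    Comparing these tuples lexicographically lets one extra estimation
    outweigh additional comparisons. This is an assumed encoding, not
    the procedure used in the study.
    """
    return tuple(step_counts.get(step, 0) for step in sorted(Step, reverse=True))


# Made-up step counts for two questions.
question_x = {Step.IDENTIFICATION: 3, Step.COMPARISON: 5}
question_y = {Step.IDENTIFICATION: 3, Step.COMPARISON: 2, Step.ESTIMATION: 1}

# The ranking depends only on the steps, not on the form of presentation.
more_complex = max((question_x, question_y), key=complexity_key)
print(more_complex is question_y)
```

Because the key contains no reference to the form of presentation, the same ranking of questions is obtained for every form, which is the invariance the paragraph describes.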

To minimize the possibility of the subject 
realizing that he or she is repeatedly being 
shown the same five questions and information 
set, several superficial characteristics of the in- 
formation set which Bertin (1983) does not 
identify as affecting the appropriateness of a 
given form of presentation are varied across the 


4 For graphic presentations, examples of Bertin’s (1983) guidelines include portraying quantitative and ordinal variables
along the two dimensions of the plane (i.e., the X-Y axes), identifying a categorical variable with color or patterns, and
specifying the variables being displayed in the graph title. Examples of Ehrenberg’s (1977) guidelines for tabular
presentations include using minimal spacing between columns and rounding numbers to the minimum number of significant
digits practical. Further discussion of these guidelines can be found in Bertin (1983) and Ehrenberg (1977).
form of presentation. For example, in the title on
the bar chart, profits are described as being for
eleven years; in the tabular presentation, the
profits are described as being for eleven months.
Second, the questions are modified to take into 
account the differences between forms of pre- 
sentation just described. Comments by the sub- 
jects during exit interviews indicate that these 
procedures effectively minimize learning ef- 
fects. 

Pictures of the information presentations are 
shown in the Appendix.’ The experimental 
questions, from least to most complex, are given 
in Table 1. A listing of the steps for each of the 
four information presentations used in the ex- 
periment is given in Table 2. As the relative com- 
plexity of the questions increases, two things 
occur. First, the difficulty of the steps it was pre- 
dicted the subjects would perform to arrive at 
the correct question-answer increases. Second, 
the number of times it was predicted a given step 
would be performed increases. 


Dependent variables. The two dependent variables in the study
are the time taken to answer a question and the accuracy of the
question-answer. For each question, there is only one correct
answer: a score of one is assigned to each correctly answered
question and a zero to each incorrectly answered question. The
time measure is equal to the difference between (1) the time at
which an information presentation and question become visible
on the CRT screen and (2) the time at which the subject presses
the “enter” key to record his/her answer.
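The two measures can be sketched as follows. This is a minimal illustration rather than the software used in the experiment: the `Trial` record, its field names, and the case-insensitive answer matching are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One question shown with one form of presentation (hypothetical record)."""
    shown_at: float      # seconds: presentation and question become visible
    entered_at: float    # seconds: subject presses the "enter" key
    response: str
    correct_answer: str


def accuracy_score(trial: Trial) -> int:
    """Return 1 for a correct answer and 0 for an incorrect one.

    Matching ignores case and surrounding spaces (an assumption).
    """
    return int(trial.response.strip().lower() == trial.correct_answer.strip().lower())


def response_time(trial: Trial) -> float:
    """Seconds elapsed between display and the "enter" key press."""
    return trial.entered_at - trial.shown_at


trial = Trial(shown_at=12.0, entered_at=31.5, response="Company B", correct_answer="company b")
print(accuracy_score(trial), response_time(trial))
```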


RESULTS 


Table 3 contains a listing of the treatment effect means and
standard deviations. Because the two dependent variables are
significantly correlated (r = -0.21, p < 0.001), the data
analysis begins with multivariate analysis of variance
(MANOVA).6 Univariate F tests are then used to


TABLE 1. Experimental questions

A. How much was Company A’s profit in month 2?
B. Which was the last company to attain its highest profit?
C. Which company had the largest absolute difference
   between its highest and second highest profit from
   month 2 to month 4?
D. In which month did the largest absolute difference
   between the profits of company A and company B occur?
E. Which company had the largest absolute decrease in
   profits from any one month to the next?





TABLE 2. Number and types of steps predicted to be performed in answering the experimental questions 











Form of presentation 








Line graph Bar chart Pie chart Table 

Question Question Question Question 
Step A B C D E A B C D EAB CDEABCDE 
Identifies 3 4 4 3 43 4 4 3 42 4 4 3 43 4 4 3 4 
Scans o 4 8 0 00 4 8 0 00 4 8 0 00 8 G60 0 4 
Compares 0 3 3 10 5 0 3 3 10 58 0 3 3 10 50 0 3 3 10 58 
Estimates 0 0 41 19 0 0 4 11 19 1 O 4 «1:19 0 «0 4 H D9 














5 In order to avoid potentially confounding effects, color enhancements of the graphs were not used; both tabular and graphic 
presentations were displayed in a monochromatic mode. Gremillion & Jenkins (1981) state that while color “can be effective 
...much research is needed to develop frameworks to guide its use” (p. 133). DeSanctis (1984) concludes that while people 
seem to prefer it, the use of color may, in some instances, detract from the effectiveness of an information presentation. 


6 Violations of MANOVA assumptions related to the homogeneity of dispersion matrices and multivariate normality are
present in the data. Harris (1975) argues that, although the robustness of the MANOVA tests has not been investigated
extensively, MANOVA should be as robust to violations of assumptions as Univariate Analysis of Variance (ANOVA) is to
analogous assumption violations. ANOVA is robust to violations of the assumption of normally distributed responses with
large, equal cell sizes, and can be adjusted, using the Huynh-Feldt (1976) method, for a non-symmetric variance-covariance
matrix. As the results of the univariate F-tests in Table 4 show, analysis of the data using ANOVA with the Huynh-Feldt
adjustment does not produce different results.


REPORT FORMAT AND THE DECISION MAKER'S TASK. 


examine the effects of the independent variables
on each of the performance measures. Finally,
comparisons of treatment effect means are performed
to identify the forms of presentation which resulted
in the best performance with each question.
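The correlation between a 0/1 accuracy score and a response time is an ordinary Pearson coefficient (the point-biserial case). The sketch below shows the calculation on made-up responses; the data and the `pearson_r` helper are illustrative assumptions and do not reproduce the study's observations or its reported value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up responses: 1 = correct, 0 = incorrect, with seconds taken.
accuracy = [1, 1, 0, 1, 0, 1, 1, 0]
seconds = [12, 25, 60, 30, 45, 18, 40, 35]

r = pearson_r(accuracy, seconds)
print(round(r, 2))
```

A negative coefficient, as in these invented data, means that slower answers tend to be the incorrect ones.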

The results of the MANOVA tests show that, by 
the Wilks’ Lambda criterion (F = 277, p < 
0.001), there is at least one significant effect. The 
results of the univariate F tests, presented in
Table 4, show that both the main and interactive
effects for the question to be answered and the
form of presentation are significant. This is true
regardless of whether performance is measured
in terms of the time required to answer the question
or the accuracy of the question-answers.

The significant main effects are important because they
support Bertin’s (1983) contention that the form of
presentation and the question to be answered both affect
performance with an information presentation. However, the
most important findings with respect to the objectives of
this study are the significant interactive effects. If
no interaction had been found, the implication
would have been that the relative efficiency and
effectiveness of different forms of presentation
are not affected by the question which the decision
maker wishes to answer and that one form
of presentation was always better than the
others. To further investigate the nature of the
interaction and determine if the hypotheses
should be rejected, two sets of comparisons are
performed.

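What an interaction means for the cell means can be shown numerically. The sketch below takes the accuracy means reported in Table 3, Panel A, and computes, for each cell, the cell mean minus its row mean minus its column mean plus the grand mean. Under a purely additive model these residuals would all be zero. The function name and the use of unweighted means (equal cell sizes) are assumptions made for illustration; this is a descriptive calculation, not the F test reported in the paper.

```python
# Cell means for accuracy from Table 3, Panel A (rows: form, columns: question).
QUESTIONS = ["A", "B", "C", "D", "E"]
ACCURACY = {
    "Pie chart":  [0.57, 0.87, 0.93, 0.63, 0.53],
    "Bar chart":  [0.90, 0.97, 0.63, 0.93, 0.53],
    "Line graph": [0.90, 1.00, 0.83, 0.70, 0.70],
    "Table":      [1.00, 0.80, 0.90, 0.83, 0.93],
}


def interaction_residuals(cells):
    """Cell mean minus row mean minus column mean plus grand mean.

    Under a purely additive (no-interaction) model every residual is zero.
    Equal cell sizes are assumed, so unweighted means are used.
    """
    forms = list(cells)
    n_cols = len(next(iter(cells.values())))
    grand = sum(sum(row) for row in cells.values()) / (len(forms) * n_cols)
    row_mean = {f: sum(cells[f]) / n_cols for f in forms}
    col_mean = [sum(cells[f][j] for f in forms) / len(forms) for j in range(n_cols)]
    return {
        f: [cells[f][j] - row_mean[f] - col_mean[j] + grand for j in range(n_cols)]
        for f in forms
    }


residuals = interaction_residuals(ACCURACY)
for form, row in residuals.items():
    print(f"{form:<10}", " ".join(f"{q}:{value:+.2f}" for q, value in zip(QUESTIONS, row)))
```

Residuals that differ markedly from zero and change sign across questions indicate that the ranking of the forms is not the same for every question, which is the pattern the comparisons that follow examine.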
Investigation of interaction 

The objective of the comparisons is to determine
which forms of presentation were most effective
and efficient for each question. The


TABLE 3. Treatment effect means (standard deviations in parentheses)

Panel A
Performance measured in terms of the accuracy of question-answers

                              Question                     Row
               A       B       C       D       E       mean
Pie chart      0.57    0.87    0.93    0.63    0.53    0.71
              (0.50)  (0.35)  (0.25)  (0.49)  (0.51)  (0.46)
Bar chart      0.90    0.97    0.63    0.93    0.53    0.79
              (0.31)  (0.18)  (0.49)  (0.25)  (0.51)  (0.41)
Line graph     0.90    1.00    0.83    0.70    0.70    0.83
              (0.31)  (0.00)  (0.38)  (0.47)  (0.47)  (0.38)
Table          1.00    0.80    0.90    0.83    0.93    0.89
              (0.00)  (0.41)  (0.31)  (0.38)  (0.25)  (0.31)
Column mean    0.84    0.91    0.83    0.78    0.68    0.81
              (0.37)  (0.29)  (0.38)  (0.42)  (0.48)  (0.45)

Panel B
Performance measured in terms of the seconds taken to answer the questions

                              Question                     Row
               A       B       C       D       E       mean
Pie chart      42      41      39      47      69      47
              (18)    (18)    (16)    (19)    (30)    (23)
Bar chart      19      34      65      49      87      51
              (9)     (18)    (29)    (19)    (42)    (35)
Line graph     26      23      50      43      63      41
              (15)    (11)    (16)    (18)    (30)    (24)
Table          11      30      39      34      62      35
              (4)     (11)    (17)    (13)    (30)    (24)
Column mean    25      32      48      43      70      48
              (17)    (16)    (22)    (18)    (35)    (23)

TABLE 4. Results of univariate F-tests

                                  Dependent variable
                            Accuracy              Time
                            F       p             F       p
Main effects
  Form of presentation      6.63    0.002         16.79   <0.001
  Question                  7.83    <0.001        48.39   <0.001
Interaction
  Form by question          3.83    0.005         14.48   <0.001








graphs shown in Figs 1 and 2 illustrate the complex nature
of the interaction. Due to the nature of the interaction, it
is not appropriate to examine the effects of different forms
of presentation without regard for the question to be
answered. Therefore, this analysis consists of post hoc
comparisons of pairs of cell means using the Scheffé method.
First, the effects of the different forms of presentation on
the accuracy of question-answers are examined. For each
question, the accuracy of the answers with each form of
presentation is compared to the accuracy of the answers with
the other forms of presentation. Second, the effects of the
different forms of presentation on the time required to
answer each question are examined in a similar manner. For
each question, the time required to answer the question with
each form of presentation is compared to the time required
with each of the other forms of presentation. The results of
these two sets of comparisons are shown in Table 5.

Fig. 1. Graph of the interactive effect between question
complexity and form of presentation on the accuracy of
question-answers. [Vertical axis: Accuracy; horizontal axis: Question.]

Fig. 2. Graph of the interactive effect between question
complexity and form of presentation on the time
required to answer a question. [Vertical axis: Seconds; horizontal axis: Question.]


TABLE 5. Forms of presentation with which post hoc
comparisons indicate each question was answered most
accurately and in the least amount of time (alpha = 0.05)

                Performance measure
Question    Accuracy       Time
A           [illegible]    T
B           [illegible]    L
C           [illegible]    P, T
D           [illegible]    T
E           [illegible]    T

T = table, B = bar chart, P = pie chart, L = line graph.
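As a descriptive cross-check, the form with the lowest mean time for each question can be read directly from the cell means in Table 3, Panel B. The sketch below does only that: it picks raw minima and performs no significance test, so it is not a substitute for the Scheffé comparisons, which may treat forms with similar means as equivalent. The function name is hypothetical.

```python
QUESTIONS = ["A", "B", "C", "D", "E"]
# Mean seconds per question from Table 3, Panel B.
SECONDS = {
    "Pie chart":  [42, 41, 39, 47, 69],
    "Bar chart":  [19, 34, 65, 49, 87],
    "Line graph": [26, 23, 50, 43, 63],
    "Table":      [11, 30, 39, 34, 62],
}


def fastest_forms(cells, questions):
    """Map each question to the set of forms with the lowest mean time."""
    result = {}
    for j, question in enumerate(questions):
        best = min(row[j] for row in cells.values())
        result[question] = {form for form, row in cells.items() if row[j] == best}
    return result


for question, forms in fastest_forms(SECONDS, QUESTIONS).items():
    print(question, sorted(forms))
```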


The results of the post hoc comparisons show that the most
appropriate form of presentation (the one which resulted in
the most accurate answers or required the least amount of
time to answer the question) was different for different
questions and support both hypothesis 1 and hypothesis 2.
For instance, the bar chart resulted in the most accurate
answers for question D but the table resulted in the most
accurate answers for question E. These results are important
because they indicate that different forms of presentation
are best for different tasks. More specifically, these
results support Bertin’s (1983) contention that the
appropriateness of different forms of presentation is a
function of the information which the decision maker wishes
to extract from a report. These findings also suggest an
explanation of the conflicting results among prior studies.

The experimental tasks used in prior studies 
varied widely. Consequently, it is likely that the 
subjects in different experiments were answer- 
ing different questions with the information pre- 
sentations they were shown. If the most approp- 
riate form of presentation is a function of the 
question to be answered, as the findings of this 
study indicate, it is not surprising that different 
forms of presentation resulted in the best per- 
formance across experiments. 

Blocher et al. (1986) manipulated the form of
presentation and the complexity of the informa- 
tion displayed. They found graphic presenta- 
tions better for low levels of task complexity and 
tabular presentations better for high levels of 
complexity. In this study, the tabular presenta- 
tion resulted in performance superior or equal 
to that of the graphic presentations for seven out 
of ten measures of performance, and in no case 
did the tabular presentation result in poorer per- 
formance than all three graphic formats. Further, 
the superiority of the tabular presentation was 
not found to be limited to any one level of com- 
plexity. Graphic forms of presentation were 
found to be better than the tabular presentation 
for only three of the six performance measures 
at the intermediate level of complexity. These 
findings suggest that tabular presentations are an 
effective and efficient form of presentation for a 
wide range of questions while graphic forms of 
presentation are appropriate only for a limited 
set of questions. In comparing these results to 
those of Blocher et al. (1986) it should be noted 
that they manipulated a different task charac- 
teristic, and their use of only two levels of task 
complexity would have prohibited the detec- 
tion of an interaction as complex as the one 
found in this study. 

Unfortunately, the graphs of the interaction and results of
the post hoc comparisons do not suggest any further
explanation of, or pattern to, the interaction which would
indicate why a particular form of presentation resulted in
the best performance for a given question. While further
interpretation of the interaction without additional
experimentation must be considered tentative, an explanation
which is consistent both with the empirical results of this
study and with prior research from other disciplines is
suggested by the subjects’ comments in the post-test
interview.

During the post-test interview, the subjects stated that the
graphic presentations were preferred when they provided a
“picture” of the data which could be used to perform one or
more of the steps necessary to answer a question. When a
graph did not provide a visual picture of relevant aspects of
the information, the subjects attempted to mentally form an
image which they could use to answer the question but found
this a difficult and time-consuming process. Consequently,
when a graph did not provide a relevant visual image of the
information, the tabular presentation was preferred. The
subjects stated that they could use the “raw data” in the
table to perform any step required to answer a question,
although greater effort was required than when a graph
provided relevant visual cues.
The subjects’ comments indicate that when 
the information was presented in graphic form 
they answered the questions by processing the 
information in a holistic manner through the use 
of visual images; when the information was pre- 
sented in tabular form the subjects answered the 
questions by processing the information in an 
analytic fashion. Numerous studies have shown 
that the use of images can facilitate the encoding 
in memory and processing of information (San- 
ford, 1985). However, humans’ ability to form 
images in the absence of relevant visual cues is li- 
mited (Simon, 1978). This suggests that a par- 
ticular type of graph results in the best perform- 
ance only when it provides visual cues which 
can be used to form the images needed to ans- 
wer a question. When a graph does not provide 
relevant visual cues, performance suffers due to 
the difficulty of forming images in the absence of 
such cues and the use of an analytic approach to 
answering the question with the tabular presen- 
tation results in the best performance. This exp- 
lanation of the interaction is consistent with the 
findings of Hammond (1980), Russo (1977), 
and Aschenbrenner (1978). Hammond observed that the form of
visual information display may induce decision makers to
process information intuitively rather than analytically,
while Russo and Aschenbrenner found improved performance
when answers were apparent from the information displayed.


Negative correlation between performance 
measures 
Contrary to what might be expected intui- 
tively, a trade-off between the accuracy of ques- 
tion—answers and the time taken to answer a 
question was not found. The negative correla- 
tion found between the time and accuracy mea- 
sures, and a comparison of the forms of presenta- 
tion which were found to result in the best per- 
formance for each question, indicate that those 
forms of presentation that allowed a question to 
be answered in the least amount of time also gen- 
erally resulted in the most accurate answers. The 
only significant exception to this finding was for
the bar chart, question D. For question D, the bar
chart resulted in the most accurate answers but 
also required more time than any other form of 
presentation to answer the question. 


LIMITATIONS AND FUTURE RESEARCH 
The results of the experiment suggest that the most
appropriate report format for a given question depends on
the specific steps required to answer the question and not
on the overall level of question complexity. To ensure that
the questions used in this study could be rank-ordered in
terms of their complexity, it was necessary to 
vary at least two of the underlying dimensions of 
complexity (i.e., identifications, scans, compari- 
sons, and estimates) between all pairs of ques- 
tions. Because the questions differ in multiple di- 
mensions it is not possible to test whether, con- 
sistent with the suggested explanation of the in- 
teraction, the effects found in this study are the 
result of differences between the forms of pre- 
sentation with respect to the visual cues which 
they provide for performing specific steps in the 
process of answering a question. 

Further explanation of the interaction will require research
designed to identify the types of questions which result in
performance differences across report formats and the visual
cues provided by the different formats which cause
the difficulty of answering the questions to vary.
Identification of the types of question which re- 
sult in performance differences will require sys- 
tematic manipulation of the underlying dimen- 
sions of question complexity rather than the 
overall level of complexity. Guidance in iden- 
tifying the visual cues which individuals per- 
ceive and can use to answer a question when 
presented with different report formats may be 
found in the work which has been done in the 
areas of pattern recognition and imagery (see 
Sanford, 1985, for a review of this literature). 
The results of this research should provide evi- 
dence concerning the forms of presentation 
which are appropriate for different types of 
questions and an understanding of why they are appropriate.
For example, it might be found that bar charts are best for
questions involving comparisons because the relative sizes
of the bars correspond to the relative sizes of the amounts
which they portray and can be used as visual cues to
determine which of the amounts is largest.


SUMMARY AND CONCLUSION 


The objective of this study was to determine if 
the most appropriate form of presentation is a 
function of the question which the decision 
maker wishes to answer with the information to be displayed.
The results indicate that the most appropriate method of
presenting financial information is dependent on the decision
maker's
question; different forms of presentation are 
most appropriate for different questions. The 
findings also indicate that performance with a 
tabular presentation will be equal or superior to 
that with a graphic presentation for most ques- 
tions. Graphic presentations result in better per- 
formance only when they provide specific visual 
cues which aid in the answering of a question. 
When a graph does not provide relevant visual 
cues, performance is best with a tabular presen- 
tation. 

The results of this study have several important
implications for future research. As discussed,
further research is needed to determine
when graphic forms of presentation provide vis- 
ual cues which facilitate the answering of the de- 
cision maker’s question. Future researchers 
need to be aware that the method of presenting 
financial information may have a significant impact on
decision performance. The use of forms of presentation which
are either not representative of those used by real-world
decision makers or are inappropriate for the task being
examined may confound and/or limit the generalizability of
the results of studies investigating the use of financial
information by individual decision makers. The significant
interaction found in this
study suggests that the conflicting results of 
prior studies investigating the effects of different 
forms of presentation can be explained by the 
use of different experimental tasks. Future 
studies examining the relative advantages of dif- 
ferent forms of presentation should control for 
the effects which the question the decision 
maker wishes to answer has on performance. 
APPENDIX 1. INFORMATION PRESENTATIONS SHOWN TO SUBJECTS
DURING EXPERIMENT

EXHIBIT 1. Line graph
Title: Profits, by company, by year, in dollars.
[Vertical axis: Dollars; horizontal axis: Year, 1976 to 1986; one line for each of companies A, B, C and D.]
Question shown: Which company had the largest absolute
decrease in profits from any one year to the next year?

EXHIBIT 2. Bar chart
Title: Profits, by company, by year, in dollars.
Question shown: Which company had the largest absolute
decrease in profits from any one year to the next year?

EXHIBIT 3. Pie chart
Title: Market share, by company, by year, as a percentage of total industry sales.
Question shown: Which company had the largest absolute
decrease in market share from any one year to the next year?

EXHIBIT 4. Table
Title: Profits, by company, by month, in dollars.
Question shown: Which company had the largest absolute
decrease in profits from any one month to the next month?

Abstract


Little formal, scientific study has been devoted to assessing the effectiveness of different reporting formats
in presenting accounting data for forecasting purposes. A laboratory study was conducted to compare the
impact of numeric and graphical reporting formats on users’ judgment heuristics and judgment accuracy
in forecasting financial statement information. Three different reporting formats were evaluated within a
“learning environment”: (1) reports containing only numeric data; (2) reports containing only graphical
data; and (3) reports combining both numeric and graphical data. Results suggest some modest support for
the contention that graphical formats can improve the accuracy of forecast judgments.


Determining the most effective means of pre- 
senting financial statement information to inves- 
tors, creditors, auditors, and general managers is 
of longstanding concern to accounting. The use- 
fulness of financial statements is directly depen- 
dent on the user’s ability to interpret, or men- 
tally represent, the data for a given investment or 
credit decision. Investors, for example, use in- 
come statement data to develop forecasts of in- 
come in order to predict the timing, amount, and 
uncertainties of future cash flows and earnings of 
the firm. Investment and credit decisions depend
critically on the proper interpretation of financial
statement information, and the accuracy
of judgments made based on that information 
(Foster, 1986). 

Little formal, scientific study has been devoted,
however, to assessing the most effective formats of
presenting accounting data for financial decisions.
Numeric reports, or tables, are the
traditional method of data presentation in ba- 
lance sheets, income statements, and other state- 
ments of the firm’s financial position (Leivian, 
1980; Sias, 1970). Past research on alternative 
reporting methods has focused almost solely on 
novel presentation formats such as Chernoff 
faces (Stock & Watson, 1984; MacKay & Villar- 
real, 1987; Moriarity, 1979). This study, on the 
other hand, examines conventional 2-dimensional
graphs that have already become commonplace in
presenting accounting data for financial decisions
(Johnson et al., 1980; DuPree et al., 1987; Wolfe &
Viator, 1987).
Specifically, this study examines the potential
of 2-dimensional bar graphs to improve the 
accuracy of forecasting of financial statement 


*This project was funded by the Graduate School of the University of Minnesota. 
information. The study attempts to determine whether a
performance advantage can be demonstrated in a setting that
has high a priori expectation of benefiting from graphical
displays. In addition, this study explores the underlying
cognitive processes associated with using graphical, as
opposed to numeric, reporting methods. The importance of
examining cognitive decision processes, as well as outcomes,
has been repeatedly noted in the literature on graphics
(Cleveland & McGill, 1987; Jarvenpaa, 1989).

We begin by presenting the research within 
the lens model, and then position the study within the
literature on presentation formats for accounting data.
Seven experimental hypotheses are developed, followed by
a summary of
the research methodology and results. Implica- 
tions for the use of graphics in presenting ac- 
counting data are given. 


RESEARCH ISSUES AND HYPOTHESES 


Conceptual framework 

Brunswik’s lens model supports the notion that a report
format can improve human judgments (Libby & Lewis, 1982).
According to the lens model, the accuracy of human judgments
is partially determined by the extent to which the individual
(1) accurately detects the properties of data and (2)
incorporates these properties in judgments. To the extent
that graphical reports improve the detection and consistent
use of data on relevant variables, the use of graphical
reports as supplements to or substitutes for traditional
numeric reports should improve judgment accuracy. Proponents
of graphics argue that the provision of a graphical dimension
to quantitative data permits accurate judgments about data to
be made “effortlessly and almost instantaneously” (see
Cleveland, 1985, p. 231). The geometrical aspects of
graphical elements, such as position and size, are presumed
to facilitate rapid assimilation of data by the user. For
this reason, the potential for graphics to improve
interpretation of accounting data is receiving increasing
interest.
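For readers unfamiliar with the lens model, the relation between detection, consistent use, and accuracy is commonly summarized by the lens model equation. The equation and symbols below follow the standard formulation in the judgment literature; they are supplied here as background and are not stated in this paper.

```latex
r_a = G \, R_e \, R_s + C \, \sqrt{1 - R_e^{2}} \, \sqrt{1 - R_s^{2}}
```

Here r_a is judgment accuracy (the correlation between judgments and the criterion), R_e is the predictability of the criterion from the cues, R_s is the consistency with which the judge uses the cues, G is the match between the judge's linear cue use and the linear cue-criterion relation, and C is the correlation between the residuals of the two linear models. In these terms, a report format could raise accuracy by raising G (better detection of the properties of the data) or R_s (more consistent use of them).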


Available research evidence 

Little systematic study has been devoted to as- 
sessing the effectiveness of graphics in present- 
ing accounting data in decision making contexts. 
Moriarity (1979) and Stock & Watson (1984) have
demonstrated the viability of using Chernoff faces to
support bankruptcy prediction decisions. But little
research has examined the decision support capability
of more conventional
2-dimensional graphs, such as pie, bar, or line 
charts, for accounting applications. An excep- 
tion is the work of Wright (1983, 1988, 1989) 
which has compared the usefulness of scat- 
terplots with numeric reports in aiding decision 
makers to detect covariation in data sets. 
Wright’s work represents one of the few at- 
tempts in the accounting literature to compare 
the relative advantage of 2-dimensional graphs 
with tables as a reporting method. He has de- 
monstrated the performance advantage of 
graphs for correlation detection in both content- 
free situations (e.g. a simple listing of numeric 
pairs) using student subjects (Wright, 1983), as 
well as context-laden settings (e.g. loan approval 
decisions) by professional accountants (Wright, 
1988). Wright’s work clearly implies the poten- 
tial power of graphs as decision aids for account- 
ing applications. Moreover, his research 
suggests that graphs encourage different cogni- 
tive responses in people than do numeric re- 
ports, thus leading to differences in data in- 
terpretation. 

The current study extends the work of Wright to consider a
more complex task than the accuracy of correlation
judgments, “a relatively simple judgement task” (Wright &
Anderson, 1988, p. 11). This study follows guidelines
presented by Jarett (1983) and develops bar graphs to
support forecasting of financial statement information. The
aim of the study is to empirically assess the impact of
graphs on users’ judgment heuristics and resulting judgment
quality within a forecasting setting. While prior studies of
financial statement presentation have compared human
judgments using graphical displays to mathematical models
applied to the same data sets (Moriarity, 1979; Stock &
Watson, 1984; Altman, 1983), this study compares human
judgments based on graphical reports with judgments based on
either numeric or a combination of graphical and numeric
reports. Thoughtful guidelines for developing graphs for
accounting reports have been developed by Jarett (1983) and
Anderson (1983). However, controlled experiments and
measurement can provide knowledge of the true potential for
graphs to serve as substitutes or supplements to standard
numeric reports.

This study also incorporates a “learning 
environment” in its comparison of alternative 
presentation formats for accounting data. Most 
prior studies comparing graphical and numeric 
reports have measured decision makers’ performance
with the reports of a single trial, often finding
little performance advantage for graphics (see
DeSanctis, 1984; Jarvenpaa & Dickson, 1988; Ives,
1982). Since accounting
data is generally provided in a numeric format, 
we might anticipate that graphical formatting 
may initially be difficult for users to com- 
prehend. A performance advantage for graphics 
may not appear on a single observation or ex- 
perimental trial. On the other hand, practice in 
viewing graphs might improve their meaningful- 
ness, and over time, a performance advantage of 
graphs in contrast to tables may be observed. To 
test the importance of learning in using graphi- 
cal reports, individuals in this study are given the 
opportunity to practice using graphical mater- 
ials and are provided with feedback on their 
forecast accuracy. 


Research hypotheses 

Seven hypotheses are developed. While 
hypotheses 1 and 2 directly examine the effects 
of graphs on forecast accuracy, hypotheses 3 and 
4 indirectly investigate forecast accuracy in 
terms of judgment errors. Hypothesis 5 
examines the influence of graphs on subjective 
evaluations of confidence in forecasts. Hypotheses
6 and 7 address the issue of practice in the
use of graphics for forecasting of financial
statement information.


Forecast accuracy and graphics 


Hypothesis 1. Forecast accuracy will be better when data
is displayed in a graphical format than when data is
displayed in a numeric format.
Hypothesis 2. Forecast accuracy will be better when data 
is displayed in a combined graphical and numeric format 
than when data is displayed in a purely graphical format. 


Forecasting is a problem setting characterized 
by a strong a priori expectation that graphs can
facilitate accurate judgments (Willis, 1987). 
Graphs are known to be particularly useful in 
tasks requiring detection of trends and relation- 
ships between data values (Dickson, DeSanctis, 
& McBride, 1986; Jarvenpaa, 1989; Wright, 
1983, 1989). These two properties of data are 
essential in forecasting (Einhorn & Hogarth, 
1982). For these reasons, forecasting represents
a task with a bias towards graphs. A performance
advantage for graphs vis-a-vis tables is expected
(hypothesis 1). Furthermore, since the forecast- 
ing of financial statement information requires 
specific point value estimation of future data, 
providing the forecaster with both graphical and 
numeric data about historical data should 
maximize the accuracy of predictions (Jarett, 
1983) (hypothesis 2).


Cognitive biases and graphics 


Hypothesis 3. Recency bias, or the tendency to overem- 
phasize recent data to the exclusion of historical data, 
will be less likely when a graphical format is used than 
when a numeric format is used. 

Hypothesis 4. Over-reliance on a single cue (such as earn- 
ings per share) to the exclusion of other relevant cues 
(such as revenues, expenses, and net income) in the de- 
velopment of forecasts will be less likely when a graphi- 
cal format is used than when a numeric format is used. 


Success in forecasting is known to require for- 
ward and backward inference with respect to 
data (Einhorn & Hogarth, 1982). Graph
theorists argue that where a large number of his- 
torical data points is available, graphs have the 
advantage over tables of facilitating summariza- 
tion of data and detection of the underlying data 
model (Cleveland, 1985; Tufte, 1983). If this is
true, then common compensatory heuristics associated
with judgment errors, such as overemphasis
on very recent data points or over-reliance
on certain cues to the disregard of others
(Wright, 1980), should be less likely if data is
displayed graphically (hypotheses 3 and 4).


Confidence effect of graphics


Hypothesis 5. User confidence with accounting reports
will be highest when data is provided in a combined format,
and higher with a graphical format than with a
numeric format.


Fischhoff & MacGregor (1982) have noted that
there is frequently a relationship between judgment
accuracy and confidence. Wright (1983)
found higher confidence in decision makers
using graphs than those using numeric reports;
people making more accurate judgments were
more confident relative to those with poorer
quality judgments. Importantly, however, some
studies report no significant relationship between
performance on a task and confidence.
For example, Chervany & Dickson (1974) found
that decision makers given aggregated data
made higher quality decisions than those receiving
the same data in the standard detailed formats,
but had less confidence in the quality of
their decisions. Others have found that decision
makers tend to be overconfident about the accuracy
of their judgments (see Einhorn, 1980; Fischhoff,
Slovic, & Lichtenstein, 1977; Oskamp,
1965). Consequently, the relationship between
decision performance and confidence is not entirely
clear. However, if a graphical format is in
fact more appropriate for performing the forecasting
task than a table, we might anticipate that
users provided with graphs or with combined
reports will be more confident in their forecasts
than those provided with numeric reports.
While the spatial characteristic of graphs should
encourage greater confidence with the graphical
reports over tables, the precision of exact
numbers provided in the combined graphical/
numeric format should make this format the
most confidence-inspiring (see Benbasat & Dexter,
1985).


Practice effects of graphics


Hypothesis 6. Practice in using graphs for forecasting will
be required before a performance advantage for this
reporting format will be observed.

Hypothesis 7. Given moderate practice in forecasting,
different learning rates will occur among users of graphs,
users of numeric reports, and users of combined
graphical/numeric reports.


While forecasters are expected to benefit
from the use of graphics, the performance advantage
might occur only after some practice in
using graphs. Several researchers who have been
unable to demonstrate performance advantages
for graphics have hypothesized that practice
with using graphs might improve decision makers’
ability to use them (Lucas & Nielsen, 1980;
Lusk & Kersnick, 1979; Powers et al., 1982; Watson
& Driver, 1983). The argument here is that
there exists a “conditioning bond” toward tables,
particularly in business problem settings,
and that this bond takes time to break down.
Using a similar rationale, Brandon and Jarrett
(1977), in research concerned with introducing
novel information into standard accounting reports,
concluded that even students with little
business experience “are unfamiliar with formats
that deviate from the traditional statement
forms and tend to disregard much of the data
content” (p. 701). The implication is that if
graphics are to have a positive impact on judgment
accuracy, the user may have to have practice
in using the reports so that the “novelty” of
the format can wear off (hypothesis 6).¹

¹ In a pilot study with only one experimental trial, graphical reports were not found to be better than tables in this forecasting
task.

However, mere practice alone may not be sufficient
to yield an observed performance advantage
of graphs over tables. In a study comparing
aggregated reports with detailed reports, Odey
& Dias (1982) found that practice with feedback
enabled users of aggregated financial reports
to improve their judgment accuracy and
reduce their decision time. Moreover, users of
aggregated reports required less time to assimi- 
late and make judgments. To the extent that 
graphs have a summarizing or aggregating effect, 
judgment accuracy may improve using this for- 
mat, provided that users are given practice at 
using them and feedback on the quality of the 
judgments that they make based on the reports. 
The current study aims to determine whether 
practice with feedback will, in fact, yield 
superior performance in a task that should bene- 
fit from a graphical display of data. 

Also, learning rates for numeric and graphical 
groups should be different if graphs are in fact 
the desirable format for displaying accounting 
data for the purposes of financial forecasting. 
Mock et al. (1972) point out that the “structure 
of data” can be varied by altering the timing of 
data, degree of detail, or format. The “structure 
of data,” in turn, influences the rate of learning. 
These learning rates are manifested in differ- 
ences in slopes of performance curves over time. 
To the extent that there are differences in the 
data structure between numeric and graphical 
data representations, learning rates for decision 
makers receiving graphs can be expected to be 
different from those using tables (hypothesis 7). 

As a final point, the duration of practice may 
affect the relative value of one report format 
over another. Assuming there is a strong prior 
expectation that graphics will improve judg- 
ment accuracy, as in the case of forecasting, then 
moderate practice should yield observable evi- 
dence in favor of graphics. However, if practice 
is extended over a very lengthy period of time, 
then the relative advantage of graphs may in fact 
fade. In one empirical study, Christ (1983) de- 
monstrated that nine months of practice with 
four types of symbolic representation formats, 
including numbers, letters, geometric shapes, 
and dots, led to no clear and consistent advan- 
tages for any one visual format over another. As 
Bettman & Zins (1979) point out, if given suffi- 
cient time people may adapt format to task. The 
current study is confined to consider the effects 
of moderate practice with feedback in the hopes 
of demonstrating a performance advantage for
graphs within this time frame. 


METHODOLOGY 


Using a simulation environment, an experi- 
ment was conducted to compare the effects of 
(1) numeric, (2) graphical, and (3) combined 
graphical/numeric reports on the ability of users 
to forecast income statement information. An in- 
come statement was chosen for examination be- 
cause Chang et al. (1983) have found that the in- 
come statement is the item with the most consis- 
tently high ranking of importance in investment 
and credit decisions. A laboratory setting was 
chosen because of the difficulty of isolating the 
variables of interest within a field setting. Also, 
there is some precedent for the use of laboratory 
simulation to study the impact of presentation 
format on forecast accuracy (Benjamin & 
Strawser, 1974; Hofstedt, 1972). In order to de- 
tect potential learning effects associated with 
using the three types of reports, repeated obser- 
vations were made on each experimental sub- 
ject. 


The experimental task 

The experimental task was developed based 
on the work of Brandon & Jarrett (1977, 1979), 
Benjamin & Strawser (1974), and Pratt (1982), 
all of whom have studied display formats for financial
statements.

A two page case write up and a set of financial 
reports for five companies were constructed for 
the experimental task.² The case scenario read as
follows: 


You are the Chief Finance Officer at High Tech Systems
Corporation ... Your boss has asked you to assess the
financial status of 5 competitor firms. The internal operations
of these firms are quite equivalent . . . They have all
been in operation for 16 years. You are to read the financial
reports for each firm, and in each case make a forecast
of the firm's earnings per share for the years 1985
through 1987 ... In order to do that, you first must estimate
revenues, expenses, and net income... When making
your forecasts, consider only the financial reports you
are given. Bringing other knowledge about the market,
the economy or the industry will lower the quality of
your forecasts . . . Success in this task requires learning. A
good strategy is first to develop a method ... that works
for one company and then apply this method when forecasting
for subsequent companies, refining the method
each time until you have developed a generally applicable
method for making your forecasts.

² Copies of the experimental materials can be obtained from the authors.


The case also defined all of the accounting and fi- 
nancial terms that subjects encountered in the 
task. 

The forecasting task required subjects to de- 
tect the trends in sales revenue and other ex- 
penses and project these trends as a form of 
point values into the future. In addition to trend 
spotting, the task required subjects to form judg- 
ments about the relationship between revenue 
and cost of sales, the relationship between ex- 
penses and earnings, and the relationship be- 
tween these income statement items and earn- 
ings per share (EPS). 

The case scenario encouraged the subjects to 
develop and refine a method for making their 
forecasts as they progressed through the five 
companies. These instructions were given for 
two reasons. First, a goal of the study was to as- 
sess the cognitive biases associated with 
numeric and graphical reports, in addition to 
measuring forecast accuracy. We wanted to sen- 
sitize the subjects to the approach they used in 
solving the task because later, after making fore- 
casts, they were asked to articulate the approach 
they used to develop their forecasts. Second, the 
study aimed to create conditions favorable to 
learning. We thus informed subjects that the task 
demands of each company were equivalent. The 
instructions deliberately encouraged the sub- 
jects to try to improve their performance on the 
task over time. In this way, practice, which was 
an independent variable in the study, became a 
more meaningful manipulation. 

Following Brandon & Jarrett (1977), a conventional
Monte Carlo simulation was used to prepare
historical and future income statement and
earnings per share data for five fictitious com- 
panies. Sales volume was extrapolated using a 
linear time-series containing an error term. The 
error term was calculated by multiplying a normal
random number for each year by the standard
deviation. The sales volume together with
the random variable for each of the years was 
then used as the basis for calculating sales re- 
venue and the cost of sales. A second time-series 
containing a different error term was used to 
generate dollar values for other expenses. Net 
income and EPS were computed by subtracting 
cost of sales and other expenses from sales revenue,
and then dividing by the total number of
corporate shares. In this manner, a complete set 
of income statement data for a company was 
generated over a 19 year period. A distribution 
of values was generated by the simulation for 
each historical period. The initial 16 years com- 
prised historical data while the final three years 
comprised future data. The mean of the distribu- 
tion, or expected value, for each of the final 
three years of data provided a standard against 
which the subjects’ forecasts could be com- 
pared; these values were also used after each ex- 
perimental trial to provide subjects with feed- 
back on the quality of their forecasts. 

The simulation was run five times, once for 
each of the five fictitious companies. To assure 
equivalency across the firms, the normal random 
error terms and relationships among variables in 
the simulation model were held constant across
all companies. This meant that all systematic ex- 
ternal variation stayed constant. Only the initial 
values for the first period sales and expenses, as 
well as the number of stockholder shares, and 
the price and cost charged per unit, were mod- 
ified from one simulation run to another, and 
thus varied from one company to another. The 
reports from each of the five firms constituted
five experimental trials, and the order of presen- 
tation of the reports was randomized across sub- 
jects. As a check on the equivalency of the five 
firms, t-tests for paired comparisons were conducted
on subjects’ forecast accuracy for all
combinations of the five companies. No signifi- 
cant differences in performance as a function of 
the company examined were observed. 
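
The data-generation procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' simulation: the function name `simulate_company`, every parameter value, and the use of a shared random seed to hold the error terms constant across companies are hypothetical choices made for the example.

```python
import random


def simulate_company(seed, first_volume, volume_growth, price, unit_cost,
                     first_expenses, expense_growth, shares,
                     volume_sd=50.0, expense_sd=2000.0, years=19):
    """Generate one fictitious company's income statement series.

    All parameter names and values are hypothetical; the paper does not
    report the figures actually used.
    """
    rng = random.Random(seed)  # same seed -> identical error terms for every company
    rows = []
    for t in range(years):
        # Linear time series plus an error term (normal random number times the s.d.).
        volume = first_volume + volume_growth * t + rng.gauss(0.0, 1.0) * volume_sd
        # A second time series with a different error term drives other expenses.
        other_expenses = (first_expenses + expense_growth * t
                          + rng.gauss(0.0, 1.0) * expense_sd)
        revenue = price * volume
        cost_of_sales = unit_cost * volume
        net_income = revenue - cost_of_sales - other_expenses
        rows.append({
            "year": t + 1,
            "revenue": revenue,
            "cost_of_sales": cost_of_sales,
            "other_expenses": other_expenses,
            "net_income": net_income,
            "eps": net_income / shares,
        })
    return rows


# Two companies share the error terms (same seed) but differ in starting
# values, prices, unit costs, and number of shares.
company_a = simulate_company(7, 1000, 40, 25.0, 15.0, 6000, 150, 10000)
company_b = simulate_company(7, 2000, 60, 30.0, 18.0, 9000, 200, 20000)

history, future = company_a[:16], company_a[16:]  # 16 historical + 3 future years

# With zero-mean errors, the expected value of each year is the noise-free trend.
expected = simulate_company(7, 1000, 40, 25.0, 15.0, 6000, 150, 10000,
                            volume_sd=0.0, expense_sd=0.0)[16:]
```

Setting both standard deviations to zero is one simple way to obtain the expected values used as the "correct" answers; averaging many simulated runs would be an alternative.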


Report formats and practice 
The experimental manipulations were (1) the 
format used to display the earnings data and (2)
practice in viewing a particular format. A stand- 
ard spreadsheet format was used in the 
“numeric” condition. Horizontal bar charts 
were used in the “graphical” condition, and hori- 
zontal bar charts with specific dollar values 
printed at the end of each bar were used in the 
“combined” graphical/numeric condition. Every 
effort was made to equate the graphical and 
numeric reports in terms of their information 
content. The graphical reports were simply a 
mode transformation of the numeric reports. Six 
graphs were always required for one numeric re- 
port on the five companies. The graphs were 
clipped together (not stapled) so that subjects 
could lay them out and examine them together. 
Most of the subjects tended to lay the graphs out, 
either side by side or above/below one another, 
and many of the subjects drew trend lines or 
other notes on the graphs as they worked. 

All graphs were prepared according to guide- 
lines proposed by Jarett (1983) for displaying fi- 
nancial statement data. In order to assure that 
Jarett’s guidelines were properly applied, and as 
a check on the overall quality of the graphs, a 
graphics artist was consulted to review the 
graphs that were used in the study. The artist was 
blind to the research hypotheses. Ives (1982) 
advises researchers who study business graphics 
to consult with graphics artists so that graphs 
can be tested at their best. If nonsignificant ef- 
fects occur in the study, the results cannot be at- 
tributed to poor quality graphs. 

Horizontal bar charts were selected as the 
graphical format because they are appropriate 
for detecting trends, relationships, and indi- 
vidual point values in data. The forecasting task 
required subjects to, first, detect the trends in 
sales revenue and other expenses and project 
these trends as a form of point values into the fu- 
ture. Simple trend lines, plots, or vertical bar 
charts also would facilitate this trend spotting as- 
pect of the task. However, horizontal bar charts 
were used because, in addition to trend spotting, 
the task required subjects to form judgments 
about the relationship between revenue and 
cost of sales, the relationship between expenses 
and earnings, and the relationship between
those income statement items and EPS. Jarett
(1983) and others (Schmid & Schmid, 1979) re- 
commend the use of horizontal bar charts when 
the relationship between data points is of importance
to the judgment task. Furthermore, since
the task required subjects to detect and project 
individual point values, bar charts were selected 
over trend lines because bar charts are consi- 
dered superior to line graphs for depicting indi- 
vidual point values (Kosslyn, 1985). 

The practice variable was operationalized by 
exposing subjects to five experimental trials. 
Each trial took 15 minutes to complete. Our in- 
terest was in comparing subjects’ forecast accu- 
racy prior to practice (i.e. the first experimental 
trial) with their forecast accuracy following
moderate practice (i.e. the fifth trial). To avoid 
order effects, the order of presentation of the fic- 
titious companies was randomized across sub- 
jects. For each trial, subjects read the historical 
income statement reports for a company and 
then developed forecasts of revenue, expenses, 
net income, and EPS for three years into the fu- 
ture. The subjects were given feedback on the 
quality of their forecasts in terms of the “cor- 
rect” future values before going onto the next 
experimental trial. The feedback reports were 
designed in the same manner as the other re- 
ports in the study. The numeric groups received 
feedback in a tabular form. The graphical and 
combined groups both received feedback with 
horizontal bars that were labelled with numeric
information. 

These conditions aimed to simulate, in a condensed
time frame, a situation where a forecaster
might adjust the approach used in developing
forecasts after observing the accuracy of his
or her projections as time unfolds. The experi- 
mental environment allowed us to speed up that 
learning process. Note that in prior studies of
graphics, subjects were given either a single trial
with no feedback or a maximum of 30 minutes of
practice with feedback (cf. Benbasat & Dexter, 
1985; Lucas & Nielsen, 1980; Lusk & Kersnick, 
1979; Watson & Driver, 1983). None of these 
studies was able to demonstrate any performance
advantage for graphs over tables. In the current
study feedback was given on all five experimental
trials in order to encourage the potential
performance advantage of graphical formats to 
emerge. 

The conditions of the study were artificial, but 
it is important to point out that the goal of the 
study was more theoretical than practical. Our
interest was in testing decision makers’ capacity
to use graphical reports, not in replicating the 
real world of forecasting financial statement in- 
formation. 


Dependent measures 

The average percent forecast error was used 
as a measure of the subject’s judgment accuracy. 
This measure was selected to keep the research 
consistent with the work on which it builds 
(Brandon & Jarrett, 1977, 1979; Pratt, 1982). To 
determine forecast accuracy for each trial, the 
subject’s forecasts of earnings per share for three 
future periods were compared to the expected 
values generated by the Monte Carlo model 
(“Accurate EPS”) in the following manner: 


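\[
\text{Forecast error} = \sum_{n=1}^{3} \frac{\left| \text{Forecast of EPS}_n - \text{Accurate EPS}_n \right|}{\text{Accurate EPS}_n}
\]

\[
\text{Average forecast error} = \frac{\text{Forecast error}}{3}
\]

\[
\text{Average percent forecast error} = \text{Average forecast error} \times 100.
\]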
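
As a worked illustration of this measure, the short sketch below computes the average percent forecast error for one trial. The function name and the sample EPS figures are hypothetical, and the use of absolute deviations is an assumption about the printed formula.

```python
def average_percent_forecast_error(forecast_eps, accurate_eps):
    """Average percent forecast error over the forecast periods of one trial."""
    if len(forecast_eps) != len(accurate_eps) or not accurate_eps:
        raise ValueError("need one forecast per accurate value")
    # Forecast error: sum of absolute deviations relative to the accurate EPS.
    forecast_error = sum(abs(f - a) / abs(a)
                         for f, a in zip(forecast_eps, accurate_eps))
    average_forecast_error = forecast_error / len(accurate_eps)
    return average_forecast_error * 100


# Hypothetical subject forecasts and Monte Carlo expected values for three years.
example = average_percent_forecast_error([1.10, 1.20, 1.30], [1.00, 1.25, 1.30])
```

In this made-up case the three relative errors are 0.10, 0.04, and 0, so the average percent forecast error is about 4.67.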


Decision speed was treated as a control vari- 
able, rather than a dependent measure. This was 
because in a pilot study with no time restrictions 
we found that subjects in the graphical condition
took an average of 45 minutes, and as long as
70 minutes, to perform the task. The majority of
the time was spent on converting the bar chart
data into precise numbers and then performing
calculations on the numeric data. When subjects 
were provided with time to convert the graphs 
into tables, the graphical manipulation of the 
study was lost. Therefore, in order to pressure 
subjects into using the visual dimension of the 
reports, we limited the time allowed for each
trial to 15 minutes. 

The Monte Carlo model used all 16 historical
data points to project into the future, placing 
progressively greater weight on more recent 
data. To assess extreme recency bias, forecast 
error was calculated for each subject based on 
forecast models that included 16, 8, 5, and 3 
periods of historical data respectively. The 
model that produced the lowest forecast error 
for the subject was identified as reflecting the 
number of years of historical data used by the 
subject. If the subject’s forecasting model was 
based on very few historical periods, such as a 
model considering only three historical periods, 
this was considered to be evidence of extreme 
recency bias. That is, if the subject’s forecast was 
based only on the three most recent data 
periods, as opposed to including data from any 
earlier historical periods, then the subject over- 
emphasized recent data, to the exclusion of his- 
torical data. 
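
The classification rule described above can be sketched as follows. The paper does not specify the form of the 16-, 8-, 5-, and 3-period forecast models, so this example assumes a simple least-squares trend line fitted to the most recent periods; the function names and sample data are hypothetical.

```python
def fit_line(values):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def project(history, periods, horizon=3):
    """Extrapolate a trend fitted to the last `periods` observations."""
    window = history[-periods:]
    slope, intercept = fit_line(window)
    last_x = len(window) - 1
    return [intercept + slope * (last_x + step) for step in range(1, horizon + 1)]


def implied_periods(history, subject_forecast, candidates=(16, 8, 5, 3)):
    """Return the window length whose model is closest to the subject's forecast."""
    errors = {}
    for periods in candidates:
        model = project(history, periods, len(subject_forecast))
        errors[periods] = sum(abs(s - m) / abs(m)
                              for s, m in zip(subject_forecast, model)) / len(model)
    return min(errors, key=errors.get)


# Hypothetical EPS history: flat for 13 years, then rising in the last three.
eps_history = [1.0] * 13 + [1.0, 1.5, 2.0]
# A subject who simply extends the last three years is classified as a
# three-period forecaster, i.e. shows extreme recency bias under this rule.
window_used = implied_periods(eps_history, [2.5, 3.0, 3.5])
```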

After each trial, subjects were asked to briefly 
describe in writing “the approach you used to 
formulate your forecast of EPS (earnings per 
share)”. This was done to capture cognitive 
heuristics used by the subjects. The retrospec- 
tive protocol data were later scored for evidence 
of reliance on cues (i.e. over-reliance on a single 
cue to the exclusion of other relevant cues). An
independent rater described and categorized 
the approach followed by each subject in de- 
veloping forecasts of EPS. The rater was familiar 
with the forecasting task but blind to the specific 
purpose of the study and hypotheses. The rater 
was given specific rules for classifying the strate- 
gies used by subjects into one of three categories 
(see the Results section). 

After the first and last experimental trials (and 
prior to seeing feedback), subjects were asked to 
rate the degree of their confidence in their fore- 
casts on a seven-point Likert scale. Our interest 
was in detecting overall confidence change dur- 
ing the experiment, not changes from trial to 
trial. 


Subjects 
Forty-eight MBA students, all in the second 
year of their graduate program, participated in
the study. All of the students had completed at 
least one course each in accounting and finance 
at the graduate level. Approximately 1/3 of the
sample were employed full-time, and most of the 
students worked at least part-time. On average, 
the students were 29 years old and had 4.4 years 
of full-time work experience in a business or ad- 
ministrative position. 


Procedure 

The experiment required two hours of the 
subject’s time. Data was collected in small 
groups of four to seven students each, over a four 
week period. Prior to performing the experi- 
mental tasks, subjects completed a research par- 
ticipation consent form and an agreement to 
keep the nature of the study confidential. As a 
performance incentive, subjects were informed 
that prize money of $50, $35, $25, and $10 
would be awarded to the top four performers on 
the forecasting task.

Subjects were randomly assigned to one of the 
three report conditions: (1) numeric; (2) 
graphical; or (3) combined. All subjects per- 
formed the same tasks. Only the format of the 
historical and feedback reports varied across 
groups. For each of the five trials, the subjects 
read the reports for a company and recorded 
forecasts of revenues, cost of sales, expenses, net 
income, and EPS for three years into the future. 
Although they were asked to develop forecasts 
for each of these five variables, subjects were 
told that their forecast of EPS was most import- 
ant and that prize money would be awarded 
based on the quality of EPS forecasts only. Al- 
ways after recording their forecasts, subjects 
were asked to describe in writing the approach 
they used to formulate their forecasts of EPS. At 
the end of each trial, subjects were provided 
with feedback on the accuracy of their judg- 
ments by showing them the values generated by 
the Monte Carlo model. The numeric group received
feedback in the format of a table. Both the
graphical and combined groups received feedback
in the format of bar charts with precise
numbers at the end of each bar. 

Subjects were not permitted to use cal- 
culators or computers in developing their fore- 
casts. This constraint, although reducing the 
realism of the experimental setting, allowed the 
data presentation format to be a pure manipula- 
tion. Our interest was in forcing the subjects to 
rely exclusively on the reports so that other decision
supporting technologies would not have a
contaminating influence on the dependent measures.


RESULTS 


Summary statistics for the average percent 
forecast error for the three experimental groups 
in each of the five trials are shown in Table 1. 
Plots of these same statistics are given in Fig. 1. 
Smaller scores correspond to better forecast accuracy
(i.e. lower average percent forecast
error). Because the homogeneity of variance as- 
sumption necessary for analysis of variance pro- 
cedures was not met, the raw values of forecast 
error were converted to a logarithmic scale and 
all subsequent analyses performed on these 
transformed scores.³ Table 2 shows transformations
on the values in Table 1. Simple observation
of the mean scores suggests that minimal
improvement in performance occurred in the
numeric group. On the other hand, the graphical
and combined graphical/numeric groups gradually
improved in the quality of their performance
over time, and there appears to be a slight “learning
curve” in these two groups. The combined
group performed somewhat better than the
graphical group across the five trials, but the
shapes of the curves for these two groups were
similar.

³ Note that scores based on percentages often result in failure to meet the homogeneity assumption necessary for ANOVA
(Winer, 1971).

[Fig. 1. Mean forecast error of three groups across five trials. Vertical axis: average percent forecast error; horizontal axis: trial.]

TABLE 1. Means and standard deviations for forecast error of three treatment groups across five trials (based on raw scores)

Group        Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Overall
             Mean      Mean      Mean      Mean      Mean      Mean
             (S.D.)    (S.D.)    (S.D.)    (S.D.)    (S.D.)    (S.D.)
Numeric      18.75     11.57     18.10     20.18     15.23     16.77
             (12.7)    (7.9)     (27.2)    (25.9)    (7.8)     (16.3)
Graphical    18.21     15.36     12.06     12.03     11.01     13.73
             (14.3)    (9.4)     (6.6)     (9.1)     (8.7)     (9.6)
Combined     20.01     10.74     10.09     9.03      9.06      11.79
             (19.5)    (9.1)     (6.4)     (5.1)     (9.5)     (9.9)

TABLE 2. Means and standard deviations for forecast error of three treatment groups across five trials
(based on transformed scores)

Group        Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Overall
             Mean      Mean      Mean      Mean      Mean      Mean
             (S.D.)    (S.D.)    (S.D.)    (S.D.)    (S.D.)    (S.D.)
Numeric      1.143     0.973     1.018     1.125     1.132     1.078
             (0.38)    (0.29)    (0.42)    (0.37)    (0.22)    (0.34)
Graphical    1.125     1.108     1.010     0.872     0.929     1.009
             (0.37)    (0.27)    (0.27)    (0.34)    (0.32)    (0.31)
Combined     1.127     0.902     0.935     0.960     0.803     0.945
             (0.41)    (0.37)    (0.26)    (0.30)    (0.36)    (0.34)


Forecast accuracy and graphics

Forecast accuracy was better in the graphical
group than the numeric group. The combined
group had the highest forecast accuracy of the
three groups. Following the approach suggested
by Winer (1971), a multivariate analysis of variance
for repeated measures was performed (see
Table 3). Results suggest that the overall performance
of subjects across these three groups
was significantly different if a 0.10 level of significance
is applied for rejecting the null hypothesis.
If a more strict 0.05 level is applied, however,
the null hypothesis cannot be rejected. We
can conclude that the results suggest moderate,
but not strong, support for hypotheses 1 and 2.


Cognitive biases and graphics 

To assess recency bias, forecast error was cal- 
culated for each subject based on forecast mod- 
els that included 16, 8, 5, and 3 periods of histor- 
ical data. The model that produced the lowest 
forecast error for the subject was identified as re- 
flecting the number of years of historical data 
used by that subject in performing the forecasting
task. In this way, subjects in each experimental
condition were classified as relying on 16, 8,
5, or 3 periods of historical data in making their
forecasts. Table 4 displays the results of this
analysis.

Subjects across all three conditions appeared 
to employ models with less than 16 periods of 
historical data. While the tendency to prefer 
more recent data over historical data may be ap- 
propriate in a forecasting task, over reliance on 
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TABLE 3. Summary of multivariate analysis of variance for tests of group, trial, and interac- 





tion effects on forecast accuracy 
Source of variation SS. df F Significance ofF 
Between subjects 
Groups 0.993 2 2.825° 0.069 
Subjects within groups 7.429 45 
Within subjects 
Trial 0.803 4,42 2.572° 0.051 
Group * trial 0.807 8,84 1.187 0.317 





TABLE 4. Evidence of recency bias based on number of historical periods associated with lowest forecast 











error 
Number of historical Count Group 
periods associated with (column % ) 
lowest forecast error (row % ) Numeric Graphical Combined 
16 Periods 0 1 o 
(0) (6.3) (0) 
(0) (100) (0) 
8 Periods 3 ' 3 5 
(18.8) (18.8) (31.3) 
(27.3) (27.3) (45.5) 
5 Periods 4 7 : 5 
(25.0) (43.8) (31.3) 
(25.0) (43.8) (31.3) 
3 Periods 9 5 6 
(56.3) (31.3) (37.5) 
(45.0) (25.0) (30.0) 


Note; the chi square statistic for the test of a relationship between group and number of historical periods 
used by the subject to develop forecasts is equal to 4.903 df = 6, significance = 0.56. 


extremely recent data, to the exclusion of histor- 
ical trends, can be counterproductive. The 
group using numeric reports was most likely to 
use only three historical data points when fore- 
casting, while those with graphical reports, in 
general, included at least five historical periods 
in their forecasting models. Reliance on only 
three periods of data indicates an extreme re- 
cency bias. The relationship between experi- 
mental condition and historical periods 
employed in forecasting was not significant. 
While the numeric group tended to be more extreme
in its recency bias, this bias was not significantly
lower in either of the graphical groups
relative to the numeric group. Therefore, hypothesis
3 is not supported.
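The chi square statistic reported in the note to Table 4 can be reproduced from the cell counts. The sketch below is a plain Pearson chi square for a contingency table; the function name is illustrative, and the significance level is not computed because that would need a chi square distribution function outside the standard library.

```python
def chi_square_independence(table):
    """Pearson chi square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Counts from Table 4: rows are 16, 8, 5 and 3 periods;
# columns are the numeric, graphical and combined groups.
table4 = [
    [0, 1, 0],
    [3, 3, 5],
    [4, 7, 5],
    [9, 5, 6],
]
stat4, df4 = chi_square_independence(table4)
print(round(stat4, 2), df4)
```

The statistic comes out at roughly 4.90 with 6 degrees of freedom, in line with the reported 4.903. Applying the same function to the counts in Table 5 gives about 5.77, matching the 5.766 reported there. Several expected cell counts are below five, so the chi square approximation should be read with some care.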

To assess subjects’ reliance on cues in developing
their forecasts, retrospective protocols
were examined for the approach followed in de- 
veloping forecasts of EPS. An independent rater 
was given a set of guidelines for classifying sub- 
jects into one of three approaches to the fore- 
casting problem: 

(1) Revenue, cost of sales, and other expenses 
were projected by examining trends in these val- 
ues. EPS was derived from these values based on 
an estimate of the number of outstanding shares 
and the values of net income. 

(2) EPS was determined by simply examining 
the historical trend in EPS and projecting this 
trend into the future (revenue, cost of sales, 
other expenses, and net income were ignored in 
the analysis).

(3) Net income was estimated by examining
the historical trend in this value and projecting
this trend into the future. EPS was derived by es- 
timating the number of outstanding shares and 
dividing this into the estimate of net income. 

The first approach is considered the “best” in
that it is the most appropriate for the task. The
first approach most closely approximates the
logic of the Monte Carlo simulation and reflects
the use of all relevant task cues by the decision
maker. The second strategy is least desirable in
that it reflects overreliance on a single cue,
namely EPS. The third strategy is intermediate in
cognitive bias in that it reflects use of more than
one cue but failure to incorporate all cues. Ratings
of protocols were made for each trial and
then combined as a measure of the subject’s
overall approach to the task throughout the experiment.

The results suggest little support for hypothesis
4 (see Table 5). Although the graphical
group was most likely to use strategy No. 1 and
least likely to use strategy No. 2, and the numeric
group was most likely to use strategy No. 2, the
differences in the observed frequencies across
groups were not significant.


Confidence effects of graphics 

Subjects’ ratings of decision confidence were 
taken after the first and last trials (see Table 6). 
The numeric group expressed greater confidence
than either the graphical or combined
conditions at the beginning of the study; in fact,
the confidence of the numeric group was significantly
greater than that of the graphical group
(t = 2.22, d.f. = 29, p = 0.04).
As the experiment progressed, the
numeric group’s confidence declined while the
graphical and combined groups became more
confident. While confidence was highest in the
combined group, these ratings were not significantly
higher than in the numeric group. Perhaps
most notable is the failure of the graphical group


TABLE 5. Evidence of reliance on cues based on strategy used to determine EPS

                                      Count                       Group
                                      (column %)
Strategy                              (row %)        Numeric    Graphical    Combined

(1) Each variable projected,                         7          10           6
    then EPS derived                                 (43.8)     (62.5)       (37.5)
                                                     (30.4)     (43.5)       (26.1)
(2) EPS forecasted based only on                     5          1            4
    historical EPS data                              (31.3)     (6.3)        (25.0)
                                                     (50.0)     (10.0)       (40.0)
(3) NI forecasted based only on                      1          3            4
    NI data, then EPS derived                        (6.3)      (18.8)       (25.0)
                                                     (12.5)     (37.5)       (50.0)
(4) No systematic strategy could                     3          2            2
    be determined by the rater                       (18.8)     (12.5)       (12.5)
                                                     (42.9)     (28.6)       (28.6)

Note: the chi square statistic for the test of a relationship between group and strategy is equal to 5.766,
df = 6, significance = 0.45.


TABLE 6. Mean ratings and analyses of variance of decision confidence for each group after the first and last trials

                      Group                           Analysis of variance
Trial      Numeric    Graphical    Combined    df      F        Significance of F
Trial 1    4.19       3.07         3.09        2,39    3.13*    0.05
Trial 5    3.53       3.31         4.00        2,38    0.787    0.46
to become more confident, relative to the
numeric group, despite their better performance
in the task. Hypothesis 5 is not supported.


Practice effects of graphics 
The analysis of variance results for the overall 
experimental model indicated a significant 
group effect at the 0.07 level, suggesting differ- 
ences in performance across the groups as they 
proceeded through the five trials (see Table 3). 
In addition, a significant trial effect suggests im- 
provement in forecast accuracy within all 
groups over time. Providing subjects with re- 
peated practice in the task, coupled with feed- 
back on their performance, appeared to have re- 
sulted in learning. Of particular interest are pos- 
sible differences in learning rates across the 
groups. 
Several approaches were used to determine
whether a significant performance advantage for
the graphical or combined groups, relative to
the numeric group, emerged over time. Table 7
presents univariate analyses of variance to test
for differences in group performance in each experimental
trial. A significant F value is obtained
in trial 5. Posterior t-tests using Scheffé’s method
indicate that forecast accuracy was significantly
better in the graphical group (t = 2.08, d.f. = 26,
p = 0.04) and in the combined group (t = 3.10,
d.f. = 25, p = 0.005), relative to the numeric
group in this trial. The performance differences
between the graphical and combined groups 
were not significant. These results suggest some 
support for hypothesis 6. However, since we do 
not know whether the performance advantage
of the graphical groups would have been maintained
had a sixth trial been conducted, this conclusion
regarding hypothesis 6 must be made
with caution.
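The F values in Table 7 follow from the usual one-way analysis of variance ratio, the mean square for groups divided by the mean square for error. A short sketch using the sums of squares and degrees of freedom printed for trials 2 and 5 (the function name is illustrative):

```python
def f_ratio(ss_between, df_between, ss_error, df_error):
    """One-way ANOVA F statistic: MS(between groups) / MS(error)."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between / ms_error


# Sums of squares and degrees of freedom as printed in Table 7.
f_trial2 = f_ratio(ss_between=0.349, df_between=2, ss_error=4.451, df_error=45)
f_trial5 = f_ratio(ss_between=0.880, df_between=2, ss_error=4.268, df_error=45)
print(round(f_trial2, 2), round(f_trial5, 2))
```

Both values agree with the tabled F statistics (1.764 and 4.642) to within rounding of the printed sums of squares.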

To test for differences in learning rates among 
the three experimental groups (hypothesis 7), 
an analysis of variance procedure was used to 
test the null hypothesis that the profiles of the 
three treatment groups had equal slopes, i.e. that
the best-fitting linear functions were parallel.
Results (in Table 8) suggest mixed support for 
hypothesis 7. A significant difference in the 
linear trends of the profiles for the three treat- 
ment groups was observed. Two-way compari- 
sons of the groups reveal that the difference in 
linear trends occurred between the numeric and 
graphical groups and between the numeric and 
combined groups. These results are consistent
with the observation of Fig. 1, which reveals similar
performance patterns in the two groups receiving
graphics, and different patterns between
these two groups and the numeric group.


DISCUSSION 


The theory of graphics would argue that fore- 
cast accuracy should be superior when reports 
contain a combination of graphical and numeric 
information, rather than numeric information 
alone (Jarett, 1983; Schmid & Schmid, 1979;
Tufte, 1983). Use of graphics should also decrease
extreme recency bias and the tendency of


TABLE 7. Univariate analyses of variance for tests of the group effect on forecast accuracy in each trial

Trial    Source of variation    SS       df    MS       F         Significance of F

1        Group                  0.003    2     0.001    0.010     0.99
         Error                  6.781    45    0.151

2        Group                  0.349    2     0.174    1.764     0.18
         Error                  4.451    45    0.099

3        Group                  0.068    2     0.034    0.315     0.73
         Error                  4.451    45    0.099

4        Group                  0.529    2     0.265    2.273     0.12
         Error                  4.451    45    0.099

5        Group                  0.880    2     0.440    4.642*    0.01
         Error                  4.268    45
TABLE 8. Summary of analysis of variance for tests on trends

Groups in the model       Source of variation    SS       df      MS       F        Significance of F

Numeric, graphical        Group and trial
and combined                Linear               0.599    2,45    0.299    3.148    0.05*
                            Quadratic            0.088    2,45    0.044    0.335    0.72
Numeric and graphical     Group and trial
                            Linear               0.361    1,30    0.084    4.278    0.047*
                            Quadratic            0.088    1,30    0.158    0.559    0.460
Numeric and combined      Group and trial
                            Linear               0.523    1,30    0.095    5.511    0.026*
                            Quadratic            0.022    1,30    0.104    0.213    0.648
Graphical and combined    Group and trial
                            Linear               0.015    1,30    0.106    0.142    0.709
                            Quadratic            0.022    1,30    0.134    0.165    0.687





decision makers to rely on a single cue (Cleve- 
land, 1985; Willis, 1987; Wright, 1980). Theory 
would also suggest that practice should improve 
the use of graphical reports (Christ, 1983; Lusk 
& Kersnick, 1979), and user confidence should 
be more positive with graphical, as opposed to 
numeric, reports (Benbasat & Dexter, 1985; Fischoff
& MacGregor, 1982). The current study intended
to test these predictions in a setting that
controlled for extraneous variables and separated
factors that otherwise operate together in
natural settings.

The results of this study provide encouraging, 
but weak, support for the use of graphics as a re- 
porting format. The results suggest limited sup- 
port for the contention that graphical and com- 
bined graphical/numeric reporting formats pro- 
vide an incremental value over a numeric format 
in forecasting financial statement information. 
The incremental value of graphics occurred only
after practice in using these formats was provided
to subjects. Learning curves were different
for the graphical and numeric groups; however,
decision confidence was not higher and recency
bias was not significantly lower in the graphical
groups.

Various counter-arguments might be raised to
question the weak results of this study. First, it is
possible that the five trials provided in this ex- 
periment were insufficient to allow users of 
graphics to meaningfully reduce and maintain 
their forecast error rate. All three groups, how- 
ever, performed reasonably well at the task, 
achieving an error rate of less than 20% in their initial
forecasts, and all groups were able to improve
the accuracy of their forecasts over time.
Nonetheless, further studies might employ more 
trials, though the risk of inducing subject fatigue 
must somehow be avoided. Second, a different 
form of graphical display, other than horizontal 
bar charts, might be argued to be more approp- 
riate for the task. We can find no good a priori 
rationale for expecting vertical bar charts or 
plots to result in better performance on the task, 
although further exploratory work on alterna- 
tive graph designs for forecasting is certainly 
worthwhile. Third, the use of color or other en- 
hancing features combined with graphics might 
be argued to encourage a performance advan- 
tage. Yet, in another study (Carbone & Gorr, 
1985) enhanced graphics were not found to 
yield any advantage over conventional graphs in 
a forecasting task. 3 

The current study could be argued to be limited
by its confined focus on a specific type of
financial forecasting problem. The data set 
employed was limited to five variables and 
lacked the natural effects of inflation, competi- 
tion, and sales cycles. Nevertheless, the subjects 
appeared highly motivated during the study, 
exhibiting interest and effort. Thus, the study 
can be argued to have tested subjects’ ability to 
forecast with the different report formats. It is 
possible that graphs would have been more ad- 
vantageous if a sample more naive to accounting 
concepts and financial forecasting had been 
studied. 

The results of the study have implications for 
both accounting practice and research. First, the 
accounting profession should be very cautious 
in expecting better financial decisions if finan- 
cial statement data is provided in graphical for- 
mats. Hence, claims made in the popular litera- 
ture, such as “business graphs have made ... 
publications clearer and more readable, and 
analysis and decision making faster and more ac- 
curate” (Brown, 1984, p. 89), should be consi- 
dered with caution. Nonsignificant effects for 
graphics as a display method have been noted in 
a whole range of decision situations, including 
military decision making, risk assessment, inven- 
tory management, agricultural analysis, crime 
reporting, and medical report analysis (see De- 
Sanctis, 1984; Ives, 1982; Jarvenpaa & Dickson, 
1988). The current study suggests the same re- 
sult may hold in the area of forecasting financial 
statement information, despite a priori expectation
of the superiority of graphics. Second, although
graphs have been used to present accounting
data in annual reports for many years,
graphics are still a more novel presentation format
than tables. In this study, forecast error was
reduced using graphics, but it took five trials for
this effect to become evident. The groups with
graphical and combined graphical/numeric formats
improved more dramatically than the
numeric group over time. The difference in learning
could be attributed to a gradual breakdown in
the conditioning bond toward tables. The implication
is that when accounting data is presented
in a graphical format, users may go through an
adjustment or a learning process before the
graphical information becomes meaningful.

Third, the findings on cognitive recency bias 
have implications for accounting research. 
There was the tendency on the part of all three 
groups to rely exclusively on very few years of 
historical data in formulating their forecasts. 
Graphical formats were unable to meaningfully 
reduce users’ tendency toward extreme recency 
bias. The absence of computational support and 
the presence of time pressure may in part account
for this result. Nevertheless, it is surprising
that the participants did not try to rely at all
on moderate or long-term historical trends, par- 
ticularly in those groups that were given graphi- 
cal displays. Hence, further research should, in 
particular, examine the current failure of 
graphics to reduce extreme recency bias and 
overreliance on cues.

Perhaps the most worthwhile avenue for
further study concerns the ability of visual displays
to improve the cognitive heuristics used
by decision makers. More systematic study of
how graphically displayed accounting data is
cognitively processed by users, and differences
in these processes between effective and ineffective
users of financial reports, might yield insight
into (1) the precise conditions in which graphs
are and are not effective as data display tools, (2)
how users might be trained, or (3) how graphs
might be altered so as to increase the power of
graphically-based methods for presenting accounting
data.
Abstract


This study adds to research which examines the construct validity of coefficients of cue importance in 
studies concerned with how decision-makers use accounting information in formulating judgments (see 
Larcker & Lessig, The Accounting Review, January, 1983, pp. 58-77; Selling & Shank, Accounting, 
Organizations and Society, 1989, pp. 65—77). Historically, accounting studies have modelled cue 
importance with almost exclusive reliance upon linear models. But as the Selling & Shank study indicates, 
inferring the importance of accounting cues through reliance upon only one kind of model can leave 
“method variance” undetected and raises threats to the construct validity of coefficients (see Cook & 
Campbell, Quasi-Experimentation: Design and Analysis Issues for Field Settings, 1979, pp. 59-70). To the
extent that cue importance appears similar across models, then the model coefficients are presumed more 
valid. While Selling & Shank compare linear models to process tracing models, we compare a linear model 
to an eigenvector-scaling routine known as the Analytic Hierarchy Process (see Saaty, The Analytic 
Hierarchy Process, 1980). As with Selling & Shank, we find that the importance of cues is sensitive to model 
choice, suggesting that more research is needed into method variance before judgments can be made with
respect to the construct validity of linear coefficients in accounting studies.


Studies of human judgment are always paramor- 
phic; that is, they can at best produce models 
that resemble rather than reproduce cognitive 
processes. Given that, it becomes important to
ensure that the coefficients of cue importance
produced by a specific model are valid proxies
for cognitive importance. One threat to construct
validity is method variance that can cause
coefficients of cue importance to be sensitive to
the experimental and modeling techniques used
to elicit them. For example, the importance of
accounting cues in bankruptcy predictions has
been shown to differ depending upon whether
linear or process-tracing models are used by experimenters
(see Selling & Shank, 1989). While
such differences in cue importance pose only a
minor threat to predictive ability, they do
threaten the validity of inferences about a subject’s
cognitive processing (Larcker & Lessig,
1983, p. 74). Because accounting has relied almost
exclusively upon linear models in policy-capturing
research, and because that research is
presumed to address important practical decisions,
further inquiry along the lines of Selling &
Shank into the method variance that might be
prevalent in accounting research is warranted
(see Cook & Campbell, 1979, pp. 59-70).
*Our appreciation extends to Al Schepanski, Frank Schmidt and participants in The Florida State University Accounting
Workshop for their helpful comments. The financial assistance of The University of Iowa College of Business is greatly
appreciated. Finally, we thank Jay Semel and our colleagues at Iowa’s University House.

Methodological constraints are threats to construct
validity, and construct validity is a threat
to inferences. As Selling & Shank state:


Since one role of accounting research is to study how ac- 
counting information is used by decision makers, studies 
of cue importance are critical in evaluating the efficacy of 
various accounting systems. The current study de- 
monstrates that process tracing studies can yield much 
different insights about cue utilization from linear mod- 
els studies (1989, p. 77). 


This study extends their research, but we deploy
an eigenvector-scaling technique known as
the Analytic Hierarchy Process (AHP hereafter)
rather than process-tracing as a basis for comparison
to the linear model (see Saaty, 1980).
We replicate the lens model study of Jiambalvo
et al. (1983) and compare the coefficients of cue
importance derived from that replication to
coefficients we derive from AHP. We find differences
between the models, thus adding to Selling
& Shank’s concern with method variance, a
concern also studied in Einhorn et al. (1979)
and Larcker & Lessig (1983). Our approach differs
from these studies in that it uses AHP rather
than process tracing as an alternative to a linear
model, and the decision context of our research
is personnel evaluation judgments in C.P.A. firms
(see Wright, 1980; Watson, 1975; Reinstein &
Smith, 1983; Jiambalvo et al., 1983; Kida, 1984;
Blocher, 1980; Jiambalvo, 1979). This is an important
area of judgment to the profession, and,
given that importance, inferences about judgments
from paramorphic models should be conditioned
by tests of method variance. This paper
contributes some results which are another step
in that conditioning. Our study also offers a
novel methodological strategy by triangulating
the coefficients of the two models with self-insight
coefficients through correlations, a technique
that we think might be useful to future researchers
concerned with method variance.


MODELING OF THE PERFORMANCE 
EVALUATION PROCESS 


While the primary purpose of this study is 
multimethod comparison of cue importance, it 
is necessary to explain briefly the context of performance
evaluation studies. Researchers were
first concerned with the self-insight of
evaluators, with self-insight understood as the
goodness-of-fit between what cues evaluators say
are important to them and how well those cues
account for the actual decisions that they make.
Typically, evaluators are asked to assign impor- 
tance rankings to different cues, and these rank- 
ings become “subjective weightings” which 
attach to each cue. Later, researchers began to 
rely upon linear regression where the cues 
served as independent variables for an overall 
performance score as the dependent variable. 
The regression coefficients are used as a set of 
objective weights indicating the relative impor- 
tance of each cue to the overall decision. The 
best example of a study which examines object- 
ive and subjective weights in performance 
evaluation is perhaps the Jiambalvo et al. (1983) 
study. 

An interesting feature of this research is the 
way in which the regression-driven objective 
weights can be compared to the self-reported 
subjective weights of evaluators. To the extent 
that the weights are similar, the evaluator is said 
to demonstrate self-insight; to the extent that the 
weights are different, the standard conclusion is 
that self-insight is poor. The status of self-insight 
in this research is consistent with other psychological
theory. In Jiambalvo et al.’s terms: “... we
would reach the standard conclusion (Slovic &
Lichtenstein, 1971) that individuals overestimate
the importance of minor [performance
characteristic] categories and underestimate the
importance of a few major categories” (1983, p.
21). In short, if we accept that the regression coefficients
are descriptively valid representations of
the importance of cues, then we can say that subjective
weights are biased with respect to objective
weights in ways which reflect a tendency
to view relatively minor factors as more important
and relatively major factors as less important
than they actually are.

But an interesting question emerges if we
challenge the validity of regression techniques
to elicit objective weightings. While researchers
are aware that the use of regression models is itself
a heuristic device, a kind of normative discourse
still takes place with respect to the quality
of subjective weights; that is, we measure the
quality of self-insight based upon how well the
subjective weights map to the regression-generated
objective weights. However, regression is
only one among many ways to model a decision
process objectively, and alternative models differ
greatly both in terms of elicitation techniques
and in terms of computational processes to
produce the weights. It may very well be that regression
produces vastly different weights than
other, equally objective, techniques, as Selling &
Shank (1989) suggest.

This study employs AHP as an alternative to 
linear regression models. The AHP technique 
has been widely used in modeling decision- 
making (see Saaty, 1980, and Zahedi, 1986, for 
survey papers). AHP was first introduced to the 
accounting literature by Patton et al. (1982), followed
by Arrington et al. (1984), along with Lin
et al. (1984). The AHP cue weights are compared
to the weights obtained from regression
and both are compared to the subjective weights
provided by subjects.¹ It is hoped that the research
can generate additional insights into the
threats of method variance.


THE ANALYTIC HIERARCHY PROCESS 


AHP is designed to address decision-making 
questions that are complex, ambiguous, difficult 
to quantify, and involve multi-attribute prefer- 
ence rankings (see Saaty, 1980). Examples of its 
application include energy policy, health care, 
defense strategies, capital budgeting, manufac- 
turing design, investment policy, and education 
(Zahedi, 1986). As with regression and other 
forms of paramorphic decision models, AHP produces
objective coefficients that perhaps are
valid constructs for the importance that decision-makers
assign to different cues that mediate
complex judgments. AHP also has some robust
properties with respect to scaling; as Arrington
et al. state, “The technique is particularly suited
for dominance data which extant multidimensional
scaling techniques would not consider as
cardinally ordered” (1984, p. 309). As opposed
to, say, regression-based models, AHP does not
require that the decision-maker respond to cardinally-scaled
decision data (like numerical
weights in regression tasks). It requires only that
the attributes being modeled be pairwise compared
and ranked on a 1 to 9 scale with respect
to pairwise importance. In this way, the cognitive
schemas employed in AHP are likely to be
vastly different from those employed in regression
tasks since the information provided and
the required responses are quite different (see
Nisbett & Ross, 1980). Thus not only the algorithmic
procedures used to develop the weights but
also the context of decision-making can create
differences across coefficients in models of the
decision task.

Since AHP has been introduced to the 
accounting literature, we will not provide a de- 
tailed pedagogical explanation of the proce- 
dures involved in the technique. The interested 
reader is referred instead to Saaty (1980) or to
the Appendix of the Arrington et al. (1984)
study. [For an overview of current AHP research,
see the special issue of Socio-Economic Planning
Sciences (Vol. 20, No. 6, 1986), which is devoted
to AHP studies in a variety of disciplines.]
Briefly, AHP begins with identifying attributes 
likely to be important in making unstructured, 
qualitative decisions. Further, AHP has the 
capacity to structure hierarchically categories of 
attributes and perform the necessary matrix 
manipulations to produce hierarchical decom- 
position of the overall decision. 
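The eigenvector-scaling step can be illustrated with a short sketch. It assumes a reciprocal pairwise comparison matrix on the 1 to 9 scale and uses power iteration to approximate the principal eigenvector; the function name and the three-attribute matrix are hypothetical, not data from this study. The consistency index follows Saaty's definition, (lambda_max - n) / (n - 1).

```python
def ahp_weights(matrix, iterations=100):
    """Approximate AHP priority weights by power iteration.

    matrix[i][j] says how much more important attribute i is than
    attribute j, with matrix[j][i] = 1 / matrix[i][j].
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]  # normalise so the weights sum to 1
    # Estimate the principal eigenvalue from A w = lambda * w.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return w, lambda_max, consistency_index


# HYPOTHETICAL judgments for three attributes, chosen to be perfectly
# consistent: A is twice as important as B and six times as important as C.
comparisons = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights, lambda_max, ci = ahp_weights(comparisons)
print([round(x, 3) for x in weights], round(lambda_max, 3), round(ci, 3))
```

Because the hypothetical judgments are perfectly consistent, the weights come out at 0.6, 0.3 and 0.1, the principal eigenvalue equals the number of attributes, and the consistency index is zero. Real judgments are rarely this tidy, and a larger index signals less consistent pairwise comparisons.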

¹ The statistical comparisons here are less important and less reliable than in most research. The reason for this is that AHP
is inherently more reliable in that standard errors are smaller. In our study, for example, it would take a very large number
of observations (on the order of 400) to achieve the same reliability as AHP. But the important point is an operational one:
it is simply infeasible in cognitive research to generate a 400-observation study.

For example, in the Arrington et al. study, one
level of the hierarchy was composed of the five
attributes that auditing standards identify as important
in the selection of analytical review procedures
(statistical performance, model cost,
model robustness, ease of application, and understandability).
After generating importance
rankings for each of these attributes, subjects
were asked to evaluate each of five analytical
review models on each of the five attributes
identified above (those models were random-walk,
random-walk with drift, two forms of regression,
and Box-Jenkins techniques).
Through eigenvector-scaling techniques, a
paramorphic model of preferences for different
procedures was produced, and those preferences
were driven by both the subject’s evaluation
of the importance of each attribute and the
subject’s perception of the desirability of each
statistical model across those attribute-importance
rankings.
In this way, preferences for models are understood
in terms of the importance of each attribute
and the ability of a specific model to accommodate
that attribute. This ability to structure
the decision space hierarchically is a major advantage
of AHP (and of process tracing) over
other techniques as it considers multiple levels
of decision variables and how they interrelate.
Since we are primarily interested in cross-methodological
comparison of AHP to regression,
we employed only one level of attributes.
However, the performance evaluation process 
in C.P.A. firms is sufficiently rich that this study 
could easily be extended to multiple levels of 
the decision space. The steps involved in an AHP 
application will become clear in the following 
discussion of the judgment task. 
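The two-level composition described above for the Arrington et al. example can be sketched as a weighted sum: each alternative's overall priority is its score on each attribute multiplied by that attribute's importance weight, summed over attributes. The attribute and model names are taken from the text, but every number below is hypothetical and the function name is illustrative.

```python
def composite_priorities(attribute_weights, local_priorities):
    """Combine attribute weights with per-attribute scores of each alternative."""
    return {
        alternative: sum(attribute_weights[a] * scores[a] for a in attribute_weights)
        for alternative, scores in local_priorities.items()
    }


# HYPOTHETICAL importance weights for two of the five attributes.
attribute_weights = {"statistical performance": 0.75, "model cost": 0.25}

# HYPOTHETICAL local priorities: how each model scores on each attribute
# (each attribute's scores sum to 1 across the alternatives).
local_priorities = {
    "random-walk": {"statistical performance": 0.3, "model cost": 0.8},
    "regression": {"statistical performance": 0.7, "model cost": 0.2},
}

overall = composite_priorities(attribute_weights, local_priorities)
print({name: round(value, 3) for name, value in overall.items()})
```

In this made-up case regression is preferred overall because it scores well on the attribute that carries most of the weight, even though random-walk is cheaper. The single-level design used in the present study stops at the attribute weights themselves.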


RESEARCH METHODOLOGY 


As mentioned previously, this study replicates
the Jiambalvo et al. (1983) study but extends
the analysis to examine the objective cue
weightings obtained from the use of AHP rather
than regression to elicit judgments. As such, the
reader is referred to the Jiambalvo et al. study for
detailed discussion of the regression and self-insight
procedures employed. Our discussion will
concentrate on what is unique to this study;
namely, the AHP procedures. We do not omit
discussion of the regression phase of the study
because it is unimportant; rather, it is in the interest
of efficiently explaining our results by
avoiding needless and repetitive methodological
discussion.

The questions we are interested in are (1) for 
a given task, are the objective cue weights ob- 
tained with regression techniques similar to 
those obtained with eigenvector-scaling tech- 
niques like AHP?; and, (2) if we also consider 
self-insight weights, what correlations emerge 
from triangulating regression, AHP, and self-in- 
sight? The first question is important in the sense 
of providing information about the sensitivity of 
decisions to the elicitation techniques 
employed to produce them. If decisions are sen- 
sitive in this way, then construct validity be- 
comes an important concern. 
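The triangulation in question (2) amounts to computing the pairwise correlations among three sets of cue weights for the same subject. A minimal sketch follows, assuming Pearson correlation; the eight weights per method are hypothetical placeholders, not results from this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL weights on the eight evaluation factors for one subject.
cue_weights = {
    "regression": [0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02],
    "ahp": [0.20, 0.18, 0.16, 0.14, 0.12, 0.10, 0.06, 0.04],
    "self_insight": [0.16, 0.15, 0.14, 0.13, 0.12, 0.11, 0.10, 0.09],
}

# One correlation per pair of methods: three pairs in all.
pairwise = {
    (a, b): pearson(cue_weights[a], cue_weights[b])
    for a, b in combinations(cue_weights, 2)
}
for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

If self-insight weights correlate about equally with both objective models, evaluating self-insight against either benchmark leads to similar conclusions; markedly unequal correlations would suggest that the verdict on self-insight depends on the model chosen.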

The second question is important since an ar- 
gument can be made that self-insight should not 
be normatively evaluated against benchmark 
models unless the distribution of associations 
between self-insight and various models is uni- 
formly distributed across models. If not distri- 
buted uniformly, then the conclusions about 
poor self-insight, bias, overestimation and unde- 
restimation of cue importance are suspect if 
used to suggest that bootstrapping or other pro- 
cedures should be used to help subjects over- 
come supposed deviations from objective 
weights. 


Subjects 

All subjects were contacted either in person 
or over the telephone and agreed to participate 
in the study. Because we are interested in policy- 
capturing issues, it is important to use subjects 
who influence firm policies with respect to per- 
sonnel evaluation and who could be considered 
experts in the area. As such, five national
partners in charge of personnel, each from a different Big
Eight firm, participated along with three
partners with regional personnel responsibilities.
These partners not only are primarily
engaged in personnel activity but are in posi- 
tions of developing and influencing firm-wide 
performance evaluation policies. Thus, while 
the subject pool is small, it is a quite powerful 
representation of influence over personnel
policies. The small size is also representative of 
most AHP studies since the technique requires 
considerable time and effort and does not lend 
itself to aggregating decision models across indi- 
viduals in the interest of inference. 


The task and the factors 

As with Jiambalvo et al., the task facing the 
subjects was to evaluate a hypothetical senior. 
The eight factors used as a basis for that evalua- 
tion were the same in both studies and are de- 
rived from the descriptive work of Watson 
(1975). The factors are presented and defined in 
Fig. 1 which is reproduced in the same form as it 
was presented to the subjects. 


The instrument 
The instrument was mailed to each partner. 
The instructions to subjects were as follows: 

In the pages that follow, you will find eight attributes that 
can be used to rate job performance. You will be asked to 
assume that you are conducting an annual review of a 
hypothetical audit senior within your firm. 

For the eight attributes, you will be asked to rate how 
important each attribute is relative to the other seven at- 
tributes; e.g., “To what extent is a senior’s ability to exer- 
cise judgment more or less important than his or her abil- 
ity to work with others?” Assume that your evaluation is 
global in nature and will be used as input to promotion, 
retention, job assignment, counseling, and salary deci- 
sions. 


In addition, the instrument included a cover 
letter explaining the study, an AHP section, a 
regression section, a self-insight section, and a 
background information and demographics 
section. (The complete instrument is available 
from the authors.) Since the AHP section is unique 
to this study while the regression and self-insight 
sections are identical to those employed in the 
Jiambalvo et al. study, only the AHP section will 
be discussed. This is followed by a brief 
discussion of the regression task and the 
self-insight task. Since those two tasks are 
replications of the Jiambalvo et al. study, the 
reader is referred to their paper for detailed 
description of those parts of the instrument. 


The eight attributes listed below have been developed from analyzing CPA 
firms’ performance evaluation forms, consultations with members of those 
firms, and research literature dealing with performance evaluation. Since 
these attributes may not be the exact ones used in your firm's performance 
evaluation reports, please pay close attention to the definitions of the 
attributes because you will use the terms throughout this section. 

1. RESPONSIBILITY          Willingness and ability to accept 
                           responsibility. 

2. ORGANIZATIONAL ABILITY  Ability to effectively utilize staff 
                           and plan work assignments. 

3. PROBLEM SOLVING         Ability to identify and develop 
                           practical, workable solutions to 
                           problems. 

4. CLIENT RELATIONS        Ability to win the confidence and 
                           respect of clients. 

5. CREATIVITY              Level of creativity exhibited in 
                           adapting to unique problem 
                           situations. 

6. TECHNICAL COMPETENCE    Knowledge and experience reflected 
                           in adherence to known and accepted 
                           procedures and principles. 

7. WORKING WITH OTHERS     Ability and desire to work 
                           effectively with people. 

8. JUDGMENT                Level of judgment exercised. 


Fig. 1. Definitions of the performance evaluation attributes. 


The AHP procedure 

The AHP procedure required subjects to iterate 
through the 28 pairwise comparisons of the eight 
factors used in the Jiambalvo et al. study. The 
order of presentation of the pairwise comparisons 
was randomized. For each pair, the subject 
assigned a value from 1 to 9 beside one of the two 
factors indicating the relative importance of that 
factor with respect to the other factor in the 
pair. An example of the scale and of the 
elicitation procedure is illustrated in Fig. 2. It 
illustrates that this hypothetical subject views 
responsibility as “of essential or strong 
importance” (“5” as the response) over creativity 
in evaluating the hypothetical senior employee. 
Similarly, the response of “7” in the second 
comparison indicates that ability to work with 
others is of “demonstrated importance” over client 
relations. 


In responding to the questions in this case, you should use the 
following scale. 

Intensity of 
Importance    Definition              Explanation 

1             Equal importance        Two activities or items contribute 
                                      equally to the objective. 

3             Weak importance         Experience and judgment slightly 
              of one over another     favor one activity or item over 
                                      another. 

5             Essential or strong     Experience and judgment strongly 
              importance              favor one activity or item over 
                                      another. 

7             Demonstrated            An activity or item is strongly 
              importance              favored and its dominance is 
                                      demonstrated in practice. 

9             Absolute importance     The evidence favoring one activity 
                                      or item over another is of the 
                                      highest possible order of 
                                      affirmation. 

2, 4, 6, 8    Intermediate values     When compromise is needed. 
              between the two 
              adjacent judgments 


Sample Responses 

Responsibility: 5          Creativity: __ 

Client Relations: __       Ability to Work With Others: 7 


[The instrument continued until all 28 pairwise comparisons were completed.] 


Fig. 2. AHP elicitation technique response scale. 
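
To make the eigenvector scaling behind AHP concrete, the following is a minimal Python sketch rather than the procedure actually used in the study. It assumes the standard formulation in which the pairwise judgments fill a reciprocal matrix and the weights are that matrix's normalized principal eigenvector, approximated here by power iteration. The function name `ahp_weights`, the three-attribute example, and its ratios are hypothetical.

```python
from itertools import combinations


def ahp_weights(n, judgments, iterations=200):
    """Approximate AHP priority weights by power iteration.

    judgments maps a pair (i, j) with i < j to the ratio by which attribute i
    is judged more important than attribute j (use 1/x when j is favored).
    """
    # Build the reciprocal comparison matrix: a[j][i] = 1 / a[i][j].
    a = [[1.0] * n for _ in range(n)]
    for (i, j), ratio in judgments.items():
        a[i][j] = float(ratio)
        a[j][i] = 1.0 / float(ratio)

    # Repeatedly multiply and renormalize; for a positive matrix this
    # converges to the principal eigenvector.
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(a[i][k] * w[k] for k in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]

    # Estimate the principal eigenvalue and the consistency index
    # CI = (lambda_max - n) / (n - 1); CI is 0 for perfectly consistent judgments.
    aw = [sum(a[i][k] * w[k] for k in range(n)) for i in range(n)]
    lam = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    return w, ci


# Eight attributes give 8 * 7 / 2 = 28 pairwise comparisons.
pair_count = len(list(combinations(range(8), 2)))

# Hypothetical, perfectly consistent three-attribute example (not study data).
example_weights, example_ci = ahp_weights(3, {(0, 1): 2, (0, 2): 6, (1, 2): 3})
```

In the hypothetical example the three ratios are mutually consistent, so the recovered weights are proportional to 6 : 3 : 1 and the consistency index is zero. Real responses are rarely this tidy, which is why a consistency check is usually reported alongside the weights.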

Note how different this task is from regres- 
sion-based elicitation techniques. In regression, 
subjects are given a series of hypothetical 
employees with numerical values assigned as 
weights to each of the cues for each employee. 
The subject tacitly and contemporaneously 
combines the weights and cues into an overall 
judgment, usually reported as a numerical rat- 
ing. These ratings serve as dependent variables 
in conducting the regression. 

In AHP, there are no hypothetical employees 
and no numerical weights assigned as scores on 
each cue. Instead, the attributes are reported 
nominally and the subject scores the compari- 
sons sequentially. We point this out not to indi- 
cate that AHP and regression are sufficiently 
different in the task requirements imposed on 
subjects to suggest that any differences in the ob- 
jective weights obtained under the two methods 
might be due to demand characteristics of the 
respective methods. 




The regression procedure 

The regression procedure was identical to the 
procedure employed in the Jiambalvo et al. 
(1983) study, and we are grateful to them for 
providing it. Each subject was given 24 case pro- 
files of hypothetical senior accountants. For 
each factor within each case, randomly gener- 
ated numerical scores of 64-100 were provided 
in ways which ensured the absence of multicol- 
linearity across the factors. Subjects provided 
overall performance evaluation scores between 
1 and 100 which served as the dependent mea- 
sures from which the regression coefficients 
were obtained. 
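
As an illustration of how regression coefficients can serve as cue weights, the sketch below fits an ordinary least squares model of the overall ratings on the cue scores and rescales the slopes to sum to one. It is a hedged example only. The text does not state how the coefficients were normalized, so dividing each absolute slope by the sum of absolute slopes is an assumption, and the names `solve` and `regression_weights` and the three-cue demonstration data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def regression_weights(profiles, ratings):
    """Return normalized cue weights from an OLS fit with an intercept.

    profiles: one sequence of cue scores per hypothetical employee.
    ratings: the subject's overall rating for each profile.
    """
    x = [[1.0] + [float(c) for c in p] for p in profiles]
    k = len(x[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, ratings)) for i in range(k)]
    slopes = solve(xtx, xty)[1:]  # drop the intercept
    # Assumed normalization: absolute slopes rescaled to sum to one.
    total = sum(abs(s) for s in slopes)
    return [abs(s) / total for s in slopes]


# Hypothetical three-cue demonstration (not study data): ratings are built
# from known slopes 0.5, 0.3 and 0.2 so the recovery can be checked.
demo_profiles = [(64, 70, 80), (90, 65, 72), (75, 100, 68),
                 (88, 77, 95), (70, 85, 90), (100, 92, 66)]
demo_ratings = [5 + 0.5 * a + 0.3 * b + 0.2 * c for a, b, c in demo_profiles]
demo_weights = regression_weights(demo_profiles, demo_ratings)
```

Because all cues in the task share the same score range, raw slopes are comparable across cues, which is what makes this simple rescaling plausible. With cues on different scales, standardized coefficients would be the more usual choice.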


Self-insight 

The self-insight section of the instrument was 
very simple and merely required subjects to allo- 
cate a total of 100 points based upon their per- 
ceptions of the relative importance of each of 
the eight factors in their evaluation decisions. 
These points served as subjective weights in the 
data analysis. 


Control issues 

Several control issues were at work in the de- 
sign. First, the order of pairwise comparisons for 
the AHP section was altered across the eight sub- 
jects. This avoids the possibility that order ef- 
fects might have influenced the AHP preference 
rankings systematically. Second, all subjects per- 
formed the AHP task first. This minimized the 
possible effects of differences across subjects 
which might have been due to differential learn- 
ing effects which could result from having some 
subjects perform the AHP or regression in oppo- 
site orders. Third, within the AHP procedure, 
some pairwise comparisons were repeated to 
provide a reliability check on responses. Inspec- 
tion of these repetitive comparisons revealed 
very little deviation in the redundant responses. 


DATA ANALYSIS 


Effects of elicitation techniques 
The major purpose of this study is to deter- 
mine the consistency of cue weightings elicited 
under AHP, regression, and self-report. Table 1 
presents for each subject the normalized cue 
weightings attached to each factor under each 
elicitation technique along with the ordinally- 
scaled ranking of each cue in terms of impor- 
tance. To determine whether the normalized 
weightings differed significantly across the three 
techniques, Friedman ANOVA on the rankings 
was performed for each subject. For each sub- 
ject, the null hypothesis of equal ranks was re- 
jected at p < 0.03. Thus, overall, the subjects’ 
rankings of each cue are not independent of the 
elicitation technique employed. 


A comparison of AHP and regression weightings 

Recall that the validity of regression coeffi- 
cients as objective cue weightings motivated 
our interest in employing the alternative AHP 
objective procedure. If weightings are sensitive 
to the model employed to elicit responses, then 
it becomes important to investigate alternative 
models like AHP in addition to continuing the 
ongoing research which relies upon regression 
exclusively. Table 2 provides Spearman rank- 
order correlations between each pair of elicita- 
tion techniques. With respect to correlations 
between the objective weights of AHP and re- 
gression, the rankings for only one of the eight 
expert subjects are significantly correlated. 

Thus the sets of objective weights appear to be 
sensitive to the elicitation technique employed 
to produce them. 
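
For readers who want to reproduce this kind of comparison, the sketch below computes a Spearman rank-order correlation between two sets of cue weights. It is an illustrative implementation, not the authors' code. It ranks each list (ties receive average ranks) and then takes the Pearson correlation of the ranks. The function names and the five-item example are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical example (not study data): two orderings that swap adjacent pairs.
example_rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

With only eight cues per subject, each coefficient rests on eight ranked pairs, so fairly large correlations are needed before they reach conventional significance levels.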


TABLE 2. Pairwise comparisons of relative rankings 

               Spearman rank order correlations 

            AHP and       AHP and        Regression and 
Subject     regression    self-insight   self-insight 

1           -0.015        0.000          0.454 
2            0.140        0.850*         0.614 
3           -0.015        0.332          0.730** 
4            0.824**      0.917*         0.807** 
5            0.471        0.625          0.890* 
6            0.454        0.741**        0.421 
7            0.559        0.669**        0.338 
8            0.515        0.768**        0.923* 
Average      0.367        0.613          0.647 

* Significant at the α = 0.01 level. 
** Significant at the α = 0.05 level. 


Correlation of subjective and objective weights 
The divergence between objective cue 
weights under AHP and regression raises in- 
teresting questions about the validity of evaluat- 
ing self-insight against regression weights; that 
is, the validity of conclusions about the quality of 
self-insight might be regression-specific. Table 2 
indicates very little difference, however, in the 
correlation of regression and self-insight as op- 
posed to the correlation of AHP and self-insight. 
Five subjects yield significant correlations be- 
tween AHP and self-insight, and four yield signifi- 
cant correlations between regression and self- 
insight. At this level of analysis, then, it would 
appear that the model used to generate object- 
ive weights makes little difference. 

But what is interesting here is the pattern of 
these significant correlations. For five of the 
eight subjects, self-insight is correlated with dif- 
ferent objective models. If we use regression, for 
example, we might conclude that subjects 1, 2, 6 
and 7 have poor self-insight (nonsignificant cor- 
relations). On the other hand, if we use AHP, 
then subjects 1, 3, and 5 have poor self-insight. If 
we use the more stringent standard that self-in- 
sight should correlate with both models, then 
only subjects 4 and 8 demonstrate self-insight. 
The point here is that the self-insight of subjects 
is evaluated differently depending upon the 
model deployed to produce objective weights. 
Why this is the case, why self-insight maps to ob- 
jective weights differently for different subjects, 
seems a fruitful area for further research. 
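
The pattern described above can be tabulated directly from the significance flags in Table 2. The short sketch below is only a bookkeeping aid. The dictionaries transcribe which self-insight correlations Table 2 marks as significant (at the 0.05 level or better), and the function name `classify` and the group labels are hypothetical.

```python
# Significance of each subject's self-insight correlation, transcribed from
# Table 2 (True = marked significant at the 0.05 level or better).
AHP_SELF_INSIGHT = {1: False, 2: True, 3: False, 4: True,
                    5: False, 6: True, 7: True, 8: True}
REGRESSION_SELF_INSIGHT = {1: False, 2: False, 3: True, 4: True,
                           5: True, 6: False, 7: False, 8: True}


def classify(ahp, regression):
    """Group subjects by which benchmark model their self-insight tracks."""
    groups = {"both": [], "ahp_only": [], "regression_only": [], "neither": []}
    for subject in sorted(ahp):
        if ahp[subject] and regression[subject]:
            groups["both"].append(subject)
        elif ahp[subject]:
            groups["ahp_only"].append(subject)
        elif regression[subject]:
            groups["regression_only"].append(subject)
        else:
            groups["neither"].append(subject)
    return groups


groups = classify(AHP_SELF_INSIGHT, REGRESSION_SELF_INSIGHT)
# Subjects whose apparent self-insight depends on the benchmark chosen.
model_dependent = sorted(groups["ahp_only"] + groups["regression_only"])
```

Five subjects fall into the model-dependent group, two correlate with both benchmarks, and one with neither, which matches the counts given in the text.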


Ancillary analyses 

Some procedures were conducted that are of 
indirect interest to the research questions that 
motivate this study. We were interested in 
whether the regression coefficients and self-in- 
sight coefficients for our subjects differed from 
those reported in the Jiambalvo et al. study. 
Using nonparametric analysis, we tested the cor- 
relation between the regression and self-insight 
ranks between the two studies. The rankings 
across the two studies were not significantly dif- 
ferent.† 

As a final point, interrater reliability was calcu- 
lated for the regression responses and ranged 
from a high of 0.559 to a low of 0.402 calculated 
as Kendall’s correlation coefficient. These 
results are similar to those in other studies 
suggesting a moderate degree of consensus ac- 
ross subjects about the relative importance of 
the factors. 


† Only six of the eight participants in this study were C.P.A.s with significant experience in auditing. Two of the subjects held 
strictly personnel-related positions. Since all of the subjects in the Jiambalvo et al. study were practitioners, we excluded the 
two subjects in our study mentioned above. 


LIMITATIONS OF AHP 


Like any paramorphic modeling technique, 
AHP has been scrutinized from both axiomatic 
and practical perspectives. Practically, given its 
reliance on pairwise comparisons, AHP does not 
easily handle large numbers of attributes. 
Axiomatically, four criticisms have been di- 
rected at AHP; as Harker & Vargas (1987) state: 


There are essentially four areas in which the AHP has 
been criticized: lack of an axiomatic foundation, am- 
biguity of the questions that the decision-maker must 
answer, the scale used to measure the intensity of the pre- 
ference, and the Principle of Hierarchical Composition 
and rank reversal (p. 1384). 


Harker & Vargas address each of these criti- 
cisms in detail. We refer the interested reader to 
their work and would only point out that, while 
there are unique aspects of AHP, many of the 
criticisms of AHP either impose axioms from 
quite alien models on AHP or develop critiques 
that are sui generis applicable to any policy-cap- 
turing method and thereby not unique to AHP. 
As a final note, Jensen (1984) provides useful in- 
sights into the apparent strengths and weaknes- 
ses of AHP vis-a-vis other methods, particularly 
least-square methods. 


SUMMARY AND CONCLUSIONS 


The major contribution of this study is to de- 
monstrate that modeling the performance 
evaluation process in C.P.A. firms is sensitive to 
the elicitation technique employed and to the al- 
gorithms used to determine objective weights. 
We are in no sense willing to engage in argu- 
ments over the superiority of one method when 
compared to another; we are only suggesting 
that methodological sensitivity of results exists. 
Thus, researchers should be careful about con- 
clusions that have to do with the quality of self- 
insight when compared to benchmark models 
and conclusions that assume the construct valid- 
ity of objective weights. 

Much more research is necessary into not 
only the efficacy of regression, AHP, and process 
tracing (as with Selling & Shank) models of per- 
formance evaluation, but also other possible 
models with equally legitimate claims to objec- 
tivity. In addition, this study suggests that the 
manner in which responses are elicited may lead 
subjects to different conclusions, and research is 
needed into which types of elicitation subjects 
feel most comfortable in utilizing. For example, 
while AHP may take longer since it requires so 
many pairwise comparisons, it is also cognitively 
simpler since it frees the subject from contem- 
poraneously processing all factors and from the 
necessity of processing numerical weights that 
attach to each factor. We simply know very little 
about what effects these differences have. 

The primary concern of this study is a 
methodological one: construct validity. 
To that end, we would extend Selling & Shank’s 
conclusions about comparisons between pro- 
cess-tracing and linear models to our AHP com- 
parisons. As they state: 


Given the methodological refinements in this study, the 
fact that it yields cue importance results which are not 
comparable to those of the linear models, and the possi- 
bility of adapting the methodology here to more com- 
plex and realistic problem contexts, there is a strong 
suggestion that future research should further probe the 
question of cue importance ... Our primary conclusion 
is that cue importance as measured by linear model coef- 
ficients is not highly correlated in our sample with other 
measures... Assessing cue importance remains a very im- 
portant open topic (1989, p. 77). 
Abstract 


This paper examines the initial planning processes of auditors, a research topic which has received very 
limited attention to date. In this empirical study auditors were given the results of analytical tests and asked 
to assess the likelihood that observed “abnormalities” were due to accounting error or irregularity as 
distinct from environmental change (i.e., an initial hypothesis). They were further asked to indicate the 
information they would seek in response to the test results, in an effort to resolve the issue adequately to 
proceed with planning the audit. The authors posit that: (1) for relatively inexperienced auditors, their 
initial hypothesis of cause will positively correlate with the type of information the auditors will 
subsequently seek (i.e., they will follow a hypothesis confirming strategy); and (2) for relatively 
experienced auditors, their initial hypothesis of cause will not correlate with the type of information sought 
but rather they will follow a balanced information search strategy. Support for this hypothesis was found 
using a rank measure of information seeking. A second measure of information seeking, based on unranked 
questions, did not support the hypothesis, however. 
Felix & Kinney (1982) provide a general overview 
of the auditors’ opinion formulation process. This 
process is viewed as being comprised of a series of 
sequential steps beginning with an “orientation” 
stage and concluding with the auditor providing a 
report. After reviewing the available literature 
relating to the opinion formulation process, Felix 
& Kinney conclude that descriptive research on the 
auditors’ initial planning processes is virtually 
nonexistent. However, the importance of the initial 
planning processes has been recognized for quite 
some time. For example, over twenty years ago Mautz 
& Sharaf (1961) discussed the significant role that 
initial planning processes play in the conduct of 
an audit. Since the publication of the paper by 
Felix & Kinney (1982), studies by Kaplan & Reckers 
(1984) and Libby (1985) have initiated work 
exploring different aspects of the auditors’ 
initial planning processes. This paper builds 
upon both the previous works on initial planning 
processes and the psychology of judgment in au- 
diting. Gibbins (1984) and Waller & Felix 
(1984) review the recent literature in social 
cognition and discuss how this literature relates 
to professional judgment in auditing. Gibbins 
speculates that auditors, because they employ 
cognitive structures, will have response prefer- 
ences which are more conservative than the en- 
vironment. Gibbins defines a conservative re- 
sponse preference as one that is more stable than 
the environment. 

In this paper we provide evidence on the ex- 
tent to which auditors’ judgments are conserva- 
tive and whether more experienced auditors 
reach more conservative judgments than less ex- 
perienced auditors. More specifically, the paper 
reports the results of a study examining the rela- 
tionship between auditors’ initial beliefs about 
the likelihood of material error in a set of 
financial statements, and auditors’ information 
seeking judgments. If auditors’ response prefer- 
ences are stable relative to the environment (i.e., 
conservative) then their initial beliefs are not ex- 
pected to greatly influence information seeking 
processes or judgments. 

Subjects in the study were professional au- 
ditors who were given an analytical review task. 
The analytical review task was chosen because it 
represents a source of evidence available to the 
auditors during initial audit planning stages. 
Further, Waller & Felix note that the role of ex- 
perience-based knowledge may be particularly 
important for analytical review judgments. That 
is, it is a diagnostic task to which the auditor may 
bring differential amounts of task related knowl- 
edge and this differential knowledge may lead to 
different decision processes and judgments be- 
tween experienced auditors. 

The next section discusses diagnostic decision 
processes and the analytical review judgment, and 
concludes with the experimental hypothesis. The 
following two sections describe the empirical 
study which was conducted and detail the results. 
The last section discusses the results of the study. 


ANALYTICAL REVIEW AND DIAGNOSTIC 
DECISION MAKING 


During the initial planning stages of the audit 
the auditor may use the results of analytical tests. 
Although professional pronouncements do not 
require auditors to use analytical tests, three of 
the firms examined by Cushing & Loebbecke 
(1983) required analytical tests during the plan- 
ning stage. The other firms, which do not require 
analytical tests in planning, were also observed to 
use them frequently nonetheless. Biggs & Wild 
(1984) report that 89% of the auditors they sam- 
pled used standard ratio analysis. All offices of 
the “Big Eight” from which we enlisted our sub- 
jects indicated that they routinely used standard 
ratio analyses. 

Analytical review procedures applied during 
the initial planning processes of the audit (i.e., 
the first two steps of the Felix & Kinney model) 
have two purposes. The first purpose is for the 
auditor to forge expectations with respect to the 
conduct of the audit. These expectations relate 
to the statement balances and the underlying 
economic events in the firms that affect these ba- 
lances. An auditor will also develop expectations 
concerning the nature and quality of the firm’s 
accounting system and all the related dimen- 
sions of internal control and management phil- 
osophy and integrity. 

The second purpose served by applying 
analytical review procedures during the initial 
planning processes is to help assess the likeli- 
hood of material errors in the various compo- 
nents of the client’s accounting system. That is, 
analytical review procedures may facilitate the 
auditors’ evaluation of the error generating 
propensities of the client’s accounting system. 
Ratio analyses may lead to the gathering of other 
information and the final evaluation an auditor 
reaches at this stage is important because it will 
influence the auditors’ tactical planning deci- 
sions. That is, Felix & Kinney posit that the 
nature, extent, and timing of planned auditing 
procedures will be different when errors are as- 
sessed to be high as opposed to low. 

The ability of analytical tests to provide the au- 
ditor with valuable initial evidence is suggested 
by authoritative pronouncements as follows: 


A basic premise underlying the application of analytical 
review procedures is that relationships among data may 
reasonably be expected by the auditor to exist and con- 
tinue in the absence of known conditions to the contrary 
[AICPA, 1984, par. 318.03]. 


Hylas & Ashton (1982) provide empirical sup- 
port for the ability of knowledge obtained from 
analytical review procedures to detect material 
errors in financial statements and to direct sub- 
sequent audit investigations. In their study, they 
examined the process that initially signaled that 
an error may have occurred. The data base in- 
cluded 281 errors requiring financial statement 
adjustment on 152 audits. The authors found 
that over 27% of all errors were initially de- 
tected by analytical review procedures. 

Libby (1985) argues that analytical review 
judgments are an example of diagnostic decision 
making that involves three stages. First, the au- 
ditor is faced with either expected or (more in- 
terestingly) unexpected financial statement re- 
lationships. Second, if unexpected relationships 
appear, the auditor develops “a series of plausi- 
ble diagnoses and a strategy for gathering further 
information.” Lastly, the diagnoses and a strategy 
for gathering further information will lead to a 
final diagnostic decision. 

Libby argues that the key to understanding the 
auditor’s diagnostic model of professional judg- 
ment appears to be an understanding of the in- 
teraction of the environmental cues and the au- 
ditor’s knowledge structure. Libby’s position is 
grounded in recent work in social cognition 
which suggests that knowledge structures (e.g., 
schemas) appear to play a significant role in the 
enactment of much behavior (Wyer & Gordon, 
1984; Rumelhart & Norman, 1984; Lord & 
Smith, 1983; Gioia & Manz, 1985). 

A schema (or schemata) is a knowledge struc- 
ture that people use to organize and make sense 
of social and organizational information or situa- 
tions. Examples of schema include stereotypes 
(Hamilton, 1979), prototypes (Cantor & Mis- 
chel, 1977), implicit theories (Brief & Downey, 
1983; Schneider, 1973), causal schemata (Kel- 
ley, 1973), and frames (Minsky, 1975). Most of 
the above schema serve to categorize and inter- 
relate information. In auditing, for example, Wal- 
ler & Felix (1984) identify two roles for 
schemas: 


(1) Schemata organize experience into generalized cog- 
nitive structures that represent knowledge about how 
the world works ... 

(2) Schemata drive the selection and comprehension of 
environmental stimuli, (1984, p. 388). 


That is, the above varieties of schema are cogni- 
tive frameworks for understanding that may 
suggest implications for behavior but are not 
generally considered as guides to behavior. 
There is one schema, a script, however, that re- 
tains knowledge of expected sequences of be- 
havior, actions and events (Abelson, 1976, 1981; 
Gioia & Poole, 1984). A script is concerned with 
both understanding the environment and guid- 
ing one’s behavior in specific contexts and situa- 
tions. 

Abelson (1976) proposes that the develop- 
ment of scripted understanding and behavior 
evolves through three stages. The first, episodic 
scripting, is elemental and is retained as a con- 
text-specific remembrance of a single experi- 
ence. When a person has experienced many 
similar episodes in similar situations, the collec- 
tion of episodic scripts evolves into a categorical 
script, a script appropriate for still a relatively 
narrow class of situations. Finally, when enough 
experience is acquired and generalized across 
contexts, a generalized script or metascript to 
guide behavior in a range of related situations 
develops. 

The description offered by Abelson is consis- 
tent with the speculation of Waller & Felix con- 
cerning the manner in which auditors learn from 
experience. Their central thesis is that learning 
from experience involves the formation and de- 
velopment of cognitive structures (1984, p. 
386). They state: 


for the most part, the auditor's cognitive structures that 
represent his or her knowledge of the practice of audit- 
ing, and that drive his or her perceptions and judgments, 
are the product of experiential actions and observations, 
(1984, p. 398). 


Kida (1984a) provides evidence related to au- 
ditors’ use of cognitive structures. In his study, 
Kida gave subjects information of equal diagnos- 
tic value that was either causal or noncausal. In- 
formation that was causal was intended to bring 
to mind a cause and effect sequence that the au- 
ditor had previously learned whereas the non- 
causal information was not intended to have this 
effect. He reports that auditors placed greater re- 
liance upon causal than noncausal data in reach- 
ing decisions regarding the likelihood of failure 
or return on investment for a firm. 

The nature and environment of auditing 
suggest that the formation and development of 
scriptual schema by auditors will be a gradual 
process probably stretching over many years. 
Waller & Felix (1984) discuss several reasons 
for this. Among them, they note that the learning 
process (of “what goes with what”) of auditors is 
similar to the learning processes of nonprofes- 
sionals, that is, it may not be efficient, and thus 
not rapid. Waller & Felix note that contributing 
to this phenomenon is the pragmatic nature of 
many auditing judgments. Many auditing judg- 
ments lead to a lack of sufficient timely outcome 
feedback for the auditor to properly evaluate the 
quality of the judgment or judgment process. 
Many repetitions may be necessary to effect 
learning. 

The information search strategies adopted by 
an auditor also may retard the auditor's learning 
from experience. Waller & Felix speculate that 
auditors have the tendency to seek information 
which confirms rather than disconfirms, and 
they argue that this tendency may have a de- 
leterious effect on the auditor's learning from 
experience. This speculation regarding a confir- 
mation search bias is consistent with findings in 
social psychology. Snyder (1981) reports that 
individuals have a tendency to adopt a confir- 
matory information search strategy. That is, 
given a choice, individuals seek information that 
is likely to confirm an initial belief. Snyder also 
reports that this finding is insensitive to where 
the belief originated, the likelihood of the be- 
lief’s accuracy, or incentives for accurate hypo- 
thesis testing. Note, however, that Snyder’s re- 
search examined person perception, social in- 
teraction, and stereotyping, and employed stu- 
dent subjects, not professional subjects. 

Joyce & Biddle (1981) caution that the results 
found in nonprofessional settings may not 
generalize to professional settings. They suggest 
that trained professionals may use different 
strategies when engaged in their professional 
work experiences. The arguments raised by Gib- 
bins (1984) suggest that the search strategies of 
professionals will be different from nonprofes- 
sionals because professionals have well de- 
veloped schema. Gibbins states: 


Because they are generated by a structure which ac- 
cumulates past experience, response preferences will 
display a conservative tendency, that is, they will tend to 
be more stable than is the environment (1984, p. 109). 


A response preference may be considered con- 
servative if it is similar for a wide range of prob- 
lems of the same type. That is, a conservative re- 
sponse preference will be relatively unaffected 
by changes in the specific factual conditions. An 
example of a conservative search strategy would 
be a balanced (systematic or mixed) informa- 
tion search strategy developed over years of ex- 
perience wherein disconfirmatory and neutral 
information as well as confirmatory information 
might be sought. The extent of an auditor’s con- 
servative tendency may be a function of the 
stage of script the auditor has developed. The 
most conservative judgments would be ex- 
pected for auditors who have developed a 
generalized or metascript to guide behavior. The 
conservative tendency would not be expected 
to be strong when an auditor has developed
only an episodic script. The presence of only an
episodic script would be expected to result in an 
information search strategy that is more confir- 
matory. This suggests that auditors with more 
experience, who are likely to have developed 
metascripts, will make more conservative judg- 
ments than auditors with less experience, who 
may not have developed such higher-order 
scripts. 

The results of two studies provide some in- 
sight into the role of experience on audit judg- 
ments. In the first study, Biggs & Mock (1983) 
applied verbal protocol analysis to determine 
the process used by four auditors to evaluate in- 
ternal control. The authors report that two 
major patterns of task behavior were found. The 
two more experienced auditors employed a
systematic strategy. Auditors following this strategy
made a thorough and sequential search of
information prior to making the required audit
decisions. On the other hand, an auditor with less
experience followed a directed strategy, in which
information was acquired specifically to make a 
single audit judgment, then information was ac- 
quired to make the second audit judgment. The 
fourth auditor followed a mixed strategy. 

In the second study, Kida (1984b) examined 
the information search strategies of auditors 
concerning the going concern status of a client. 
One group of auditors was instructed to identify 
questions thought to be relevant for judging 
whether a firm would be viable (e.g., continue in
operations unaided for two years) and a second 
group of auditors was instructed to identify 
questions thought to be relevant for judging 
whether a firm would fail (e.g., enter bankruptcy 
proceedings within two years). The framing of 
the question was hypothesized to direct
information selection (expecting a hypothesis-confirming
approach). The results showed only limited
support for the use of confirmatory strategies.
Regardless of instructions, the auditors chose
more failure questions as relevant than viable
questions. This result, however, may be
explained by the type of subjects used in the
study. The auditors were either partners or
managers, all of whom have an extensive amount of
experience. Such experienced auditors, because
of the presence of well developed metascripts,
are most likely to make conservative judgments.
This discussion leads to a testable hypothesis: 


H1: There will be a significant interaction effect between 
an auditor's initial belief and years of audit experience, 
such that auditors with more audit experience will have 
more conservative information search strategies than 
less experienced auditors. 


A caveat is necessary here so the reader will not 
misinterpret our discussion. This study is
descriptive only. It is not the authors’ intent to
imply that within the analytical review task
there is one optimal search strategy or what that
strategy might be. Interested readers are
directed to Klayman & Ha (1987) for a discussion
of optimal strategy under various environmental 
conditions. Much work remains before norma- 
tive statements are appropriate. 


METHOD 


An experiment was conducted using practis- 
ing auditors assigned an analytical review task. 
The subjects, task, independent variables, and 
dependent variables are described below. 


Subjects 
The subjects of the experiment were seventy-one
practising CPAs from the Phoenix, Arizona
offices of each of the “Big Eight” public accounting
firms. All were experienced auditors; the
average experience was eight years.

The research was executed with complete 
subject anonymity. Selection of audit subjects 
was made by a liaison officer with each firm. 
Questionnaire packets were delivered to the 
liaison officer, who was responsible for the dis- 
tribution and subsequent collection of the sea- 
led instruments. 


Task 

Each participant received a booklet contain- 
ing instructions, the experimental materials, and 
a debriefing questionnaire. The instructions in- 
dicated that the study was concerned with the 
use of analytical reviews for initial audit planning 
and the materials included background informa- 
tion about the client. Following this information, 
the auditor received a three-ratio financial pro- 
file for the prior year (audited), the current year 
(unaudited), and current industry average (au- 
dited). The three ratios, gross margin, current 
ratio, and quick ratio, have been found to be fre- 
quently applied analytical procedures by Daroca 
& Holder (1985). Libby (1985) also used these 
three ratios. The financial profile for the prior
year and the current year of the client showed
substantial change and was similar to that used
by Libby (1985). The change was produced by
seeding a specific error (an unrecorded purchase
of inventory) and adding negligible variation to
the prior year’s statements. After reading the
case materials, the auditors were asked to
provide: (1) an initial belief about the cause of
the change in the financial ratios; and (2) the
information they would seek in the conduct of
the audit.


Independent variables

The background information auditors re- 
ceived was manipulated so that the auditors’ ini- 
tial beliefs about the likelihood of error would 
contain a wide variety of estimates. A diversity of 
initial beliefs was believed desirable in order to 
explore the relationship between initial beliefs
and information search behavior. A review of the
authoritative literature was conducted to identify
information an auditor might consider when
interpreting the results of analytical tests. These
are:


(1) Comparison of the financial information with similar 
information regarding the industry in which the entity 
operates.

(2) Study of the relationships of the financial information 
with the relevant nonfinancial information (AICPA, 
1984, par. 318.06). 


This study manipulated both categories of
information. First, the study included two levels of
information about financial ratios for the
industry. (In all instances the gross margin and
current ratios for the firm were significantly
different from last year’s figures for the firm.) Under
the similar condition the financial ratios for the 
industry were substantially the same as the 
client’s current unaudited financial ratios. Under 
the dissimilar condition the financial ratios for 
the industry were substantially different from 
the client’s current unaudited financial ratios. 
Our expectation was that the presence of dis- 
similar industry financial ratios would increase 
the auditors’ initial belief in the client’s financial 
statements containing an error. 

The second factor manipulated was nonfinan- 
cial information. The particular item of nonfi- 
nancial information included in the study was 
management integrity. The auditor’s standards 
of field work state that the scope of the auditor's 
examination would (should) be affected by cir- 
cumstances that raise questions concerning the 
integrity of management (AICPA, 1984, par.
327.06). Further, in their investigation on man- 
agement fraud, Albrecht & Romney (1980) 
found a number of factors concerning the per- 
sonal integrity of the chief executive officer to 
be discriminating between fraud and nonfraud 
clients. Based upon the professional literature 
and the results obtained by Albrecht & Romney 
(1980) two levels of management integrity in- 
formation were included in the study. Under the 
high condition, the C.E.O. was represented as
one of high reputation, and one who had
maintained very good relations with both
stockholders and employees. Under the low
condition, the C.E.O. was represented as one of low
reputation, who was a fashionable but
questionable Las Vegas figure who had frequent run-ins
with legal authorities. Our expectation was that 
low management integrity would increase the 
auditors’ initial belief in the client’s financial 
statements containing an error. 

A third independent variable was the auditors’ 
years of work experience. To assess whether 
years of experience moderated the information 
search behavior of auditors it was necessary to 
have auditors with diversity in audit experience. 
The years of experience for the subjects are 
shown in Table 1. As shown, the subjects in the 
study provide a broad range of experience, with 
a low of two years to a high of over fifteen years 
of experience. Information about audit experi- 
ence was obtained from the debriefing question- 
naire. 


Dependent variables 

The study included two dependent measures: 
(1) the initial belief; and (2) information seeking 
judgments. After reading the experimental 
materials the auditors were first asked the fol- 
lowing question: 


Indicate the relative likelihood for which you believe the 
changes in the current year’s unaudited financial state- 
ment ratios are due to (1) normal year-to-year variation 
and/or to (2) a material error and/or irregularity in the 
unaudited statements. The two numbers you supply 
should sum to 100. 


TABLE 1. Subjects’ years of audit experience

Years of experience    Number of auditors
2                       1
3                       8
4                       9
5                      11
6                      12
7                       7
8                       6
9                       4
10-15                   6
15+                     7
Total                  71


The second dependent measure concerned
information seeking judgments made by auditors.
Auditors were given a list of twenty questions.
Ten of the questions were about errors which may
have occurred in the financial statements. The
ten errors that were selected were
based on the results of Libby (1985). That is, a 
question was formed for each of the ten errors 
with the highest rated likelihood of explaining 
the ratio change as found by Libby. An example 
of a question of this type was, “Were purchase
returns not properly recorded this period?” The
other ten questions concerned year-to-year
variations which may have occurred. An extensive
list of questions about the environment was
generated by a group of auditing professionals.
From this list a pilot test was used to select the 
ten questions included in the study. An example 
of a question of this type was, “Has inventory 
turnover slowed?” A complete listing is available 
from the authors. 

From the list of 20 questions that were pro- 
vided, each auditor was asked to select the ten 
questions to which he or she would initially seek 
answers in an effort to explain the marked 
change in the financial ratios. Following this, au- 
ditors were asked to rank order the first six ques- 
tions to which they would seek answers. The 
questions were presented in a randomized 
order. 

Two measures of information seeking be- 
havior were developed. The first measure was 
the number of error and/or irregularity ques- 
tions selected. This first measure provides an 
error score. A maximum score is 10 and a 
minimum score is 0. The second measure used 
the first six questions the auditor would seek 
answers to weighted by the ranking given by the 
auditor. Thus, the first question was weighted a 
six and the sixth question was weighted a one. 
The score for the second measure was determined
by summing the value for each error
question selected. A maximum score is 21 (6 +
5 + 4 + 3 + 2 + 1) and a minimum score is 0.
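
The two scoring rules described above can be illustrated with a minimal Python sketch. The question labels (`E1` to `E10` for error questions, `V1` to `V10` for environment questions), the function names, and the sample selections are hypothetical and are introduced only for illustration; they are not the study's actual instrument.

```python
from typing import Sequence, Set

# Hypothetical labels: E1..E10 are error questions, V1..V10 environment questions.
ERROR_QUESTIONS: Set[str] = {f"E{i}" for i in range(1, 11)}


def error_count_score(selected_ten: Sequence[str]) -> int:
    """Measure 1: number of error questions among the ten selected (0 to 10)."""
    if len(selected_ten) != 10:
        raise ValueError("exactly ten questions must be selected")
    return sum(1 for q in selected_ten if q in ERROR_QUESTIONS)


def ranked_error_score(ranked_six: Sequence[str]) -> int:
    """Measure 2: rank-weighted error score over the first six questions (0 to 21)."""
    if len(ranked_six) != 6:
        raise ValueError("exactly six questions must be ranked")
    weights = range(6, 0, -1)  # first-ranked question weighs 6, sixth weighs 1
    return sum(w for q, w in zip(ranked_six, weights) if q in ERROR_QUESTIONS)


# Illustrative (made-up) responses from one auditor.
selected = ["E1", "E3", "V2", "E5", "V4", "V7", "E8", "V1", "E9", "V10"]
ranked = ["E5", "V2", "E1", "E8", "V4", "V7"]

print(error_count_score(selected))  # 5 error questions among the ten
print(ranked_error_score(ranked))   # 6 + 4 + 3 = 13
```

An auditor who ranks six error questions first reaches the maximum of 21, and one who ranks six environment questions first scores 0.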


RESULTS 


The results will be presented in two sections. 
The first section will contain results for the first 
dependent variable, the initial belief judgment. 
The second section will provide results on the 
association between the initial belief and
information seeking judgments.


Initial belief judgment
A two-factor analysis of covariance (ANCOVA)
was conducted. Management integrity and
industry average were each manipulated at two
levels and the auditors’ years of work experience 
was included as a covariate. As a manipulation 
check for the management integrity manipula- 
tion, at the end of the study, subjects were asked 
to “Indicate the integrity of management on the 
scale below.” A seven point scale was provided, 
with end point anchors labeled “low integrity” 
and “high integrity”. Analyses were conducted in 
the aggregate and separately for each experi- 
ence group. In all cases, analyses indicated that 
the management integrity experimental treatment
significantly influenced the manipulation
check responses of the subjects. No manipula- 
tion check was provided for industry averages 
because of the difficulty of question framing. 

The auditors’ likelihood estimate that changes 
in financial ratios were due to an error and/or
irregularity in the unaudited financial statements
was used as the dependent variable. We refer to 
this likelihood estimate as the “initial belief”. 

Table 2 presents the ANCOVA findings and 
Table 3 provides the treatment means and stand- 
ard deviations for the initial belief judgment. 
Several observations may be made concerning 
these results. First, as expected, significantly
stronger initial beliefs in an error were formed
when the results of analytical tests for the firm
were not in agreement with other firms in the
industry and when management integrity was low.
Second, the auditors’ years of work experience
also significantly affected auditors’ initial beliefs.
An observed negative coefficient indicates that
auditors with less work experience tended to make
stronger error attributions. Third, the means and 
standard deviations shown in Table 3 indicate 
that the objective of auditors forming a broad 
range of initial beliefs was achieved. 
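
As a hedged arithmetic check, the F-scores reported in Table 2 can be reproduced from the tabled sums of squares and degrees of freedom, since each F-score is the effect's mean square divided by the error mean square. The variable and function names below are illustrative; the numbers are transcribed from Table 2.

```python
# Values transcribed from Table 2 (error attribution score).
SS_ERROR = 20815.2
DF_ERROR = 64

# Each effect has one degree of freedom, so its mean square equals its sum of squares.
effect_mean_squares = {
    "experience (covariate)": 2710.1,
    "management integrity": 3138.8,
    "industry average": 1357.4,
    "integrity x industry": 10.3,
}

ms_error = SS_ERROR / DF_ERROR  # error mean square


def f_score(ms_effect: float, ms_err: float) -> float:
    """F-score as the ratio of an effect's mean square to the error mean square."""
    return ms_effect / ms_err


f_scores = {name: round(f_score(ms, ms_error), 2)
            for name, ms in effect_mean_squares.items()}

print(round(ms_error, 1))  # 325.2
for name, f in f_scores.items():
    print(f"{name}: F = {f}")
```

The computed ratios (8.33, 9.65, 4.17, and 0.03) agree with the F-score column of Table 2.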


TABLE 2. ANCOVA findings for error attribution score

Source                       Sum of squares   df   Mean square   F-score   Probability
Covariate:
  Experience                         2710.1    1        2710.1      8.33         0.005
Main effects:
  A: Management integrity            3138.8    1        3138.8      9.65         0.003
  B: Industry average                1357.4    1        1357.4      4.17         0.045
Interaction effect:
  A × B                                10.3    1          10.3      0.03         0.859
Error                               20815.2   64         325.2


TABLE 3. Treatment means and standard deviations for error attribution score

Treatment                            Mean    Standard deviation
Management integrity:  Low          36.10    21.40
                       High         23.15    14.97
Industry average:      Similar      24.98    13.82
                       Dissimilar   34.60    22.60


Information seeking
Subjects indicated the first ten questions to
which they would initially seek answers and
then rank-ordered the six questions to which
they would initially seek answers. However,
subjects who indicated they do not regularly
consider changes in financial ratios were dropped
from this analysis. Of the seventy-one subjects,
fifty-seven responded yes to the question, “Do
you frequently consider changes in the client’s
financial ratios when first planning the audit?”
Thus, only fifty-seven subjects were included in
the information search analysis.

The two measures of information search were 
analysed using multiple regression analysis. The 
independent variables for each of two separate 
analyses were the auditor’s initial belief, years of 
work experience, and the interactive element
(product) of these two factors. We hypothesized
a significant interactive element in the regres- 
sion equation. In the first analysis, the dependent 
variable was the number of error questions 
selected by the auditor. The results are shown in
Table 4. In this first analysis the initial belief
was significant at the 0.06 level and the years of
work experience at the 0.11 level. The interaction
was not significant at conventional levels
(p = 0.15). This result does
not support the hypothesis. 
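
The form of the regression model described above (initial belief, years of experience, and their product as predictors) can be sketched as follows. This is a minimal illustration under stated assumptions: the data are hypothetical and noise-free, the helper names are invented, and the fit returns unstandardized coefficients rather than the standardized coefficients reported in Table 4. It is not a reconstruction of the study's analysis.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_moderated_regression(belief: Sequence[float],
                             experience: Sequence[float],
                             score: Sequence[float]) -> List[float]:
    """Least-squares fit of
    score = b0 + b1*belief + b2*experience + b3*(belief*experience)
    via the normal equations."""
    rows = [[1.0, b, e, b * e] for b, e in zip(belief, experience)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, score)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data on a 3 x 3 grid of beliefs (%) and years of experience.
beliefs = [b for b in (10.0, 50.0, 90.0) for _ in range(3)]
experiences = [e for _ in range(3) for e in (2.0, 6.0, 12.0)]
true_coefs = (2.0, 0.20, 0.50, -0.02)  # made-up values with a negative interaction
scores = [true_coefs[0] + true_coefs[1] * b + true_coefs[2] * e + true_coefs[3] * b * e
          for b, e in zip(beliefs, experiences)]

b0, b1, b2, b3 = fit_moderated_regression(beliefs, experiences, scores)


def belief_slope(years: float) -> float:
    """Slope of score on initial belief at a given level of experience."""
    return b1 + b3 * years


print(belief_slope(2.0), belief_slope(12.0))
```

With a negative interaction coefficient, the slope of the search score on initial belief shrinks as experience grows, which is the moderating pattern the hypothesis predicts.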

Further analysis was conducted to determine 
if there was a general tendency to select either 
error questions or environment questions. Re- 
call that Kida (1984a) found auditors had a ten- 
dency to indicate that failure information was 
more relevant than viable information. Based 
upon the results of Kida one could speculate that 
auditors would be more prone to ask error ques- 
tions than environment questions. This ten- 
dency was not found. Overall, the auditors were 
about equally likely to search for answers to 
error and environment questions. The mean was 
4.86 error questions for all subjects. Table 5
shows the error questions, by frequency of
selection, as included in the first ten questions to
which auditors would seek an answer. However,
examination of the number of auditors seeking
responses to individual questions shows that
auditors were not making random choices. As
Table 5 shows, forty-five auditors would seek an
answer to Question 5 whereas only eight would
seek an answer to Question 4.


TABLE 4. Results of multiple regression analysis

                         Information search measure #1   Information search measure #2
                         SRC¹     T-value   Prob.        SRC      T-value   Prob.
Error attribution (A)     0.49    1.920     0.06          0.76     3.082    0.003
Work experience (B)       0.38    1.649     0.11          0.51     2.324    0.02
A × B                    -0.39    1.442     0.15         -0.62    -2.417    0.02
Multiple R                0.39

¹SRC = Standardized Regression Coefficient.

In the second analysis, the dependent variable 
was the rank score of the error questions. Recall 
this measure was constructed by weighting each 
error question according to its ranking. The first 
question was weighted a six and the sixth ques- 
tion was weighted a one. Once again the inde- 
pendent variables were the auditor's error at- 
tribution, years of public experience, and the in- 
teraction of these factors. The results are shown 
in Table 4. In the second analysis all three factors 
are significant. To investigate the significant in- 
teraction a separate regression was conducted 
for the group of subjects with limited experience
(i.e., less than six years) and the group of
subjects with extensive experience (i.e., six
years or more). For each regression the independent
variable was the error attribution score and
the dependent variable was the rank score. The
results of the two regressions are shown in Fig. 1
and Table 6. As shown, the information search
behavior for the more experienced subjects
(those with 6 years or more experience) did not
correlate with the initial belief. That is, the initial
belief was not a significant variable in the model.
The regression equation for this group of
experienced subjects is score = 8.17 - 0.01 (initial
error belief). The multiple R was 0.03. On the
other hand, for the auditors with 5 years’
experience or less, the information search behavior
was directly related to the initial belief. The form
of this association between the initial belief and
information search is shown in Fig. 1. As shown,
as the error or irregularity initial belief
increased, the error rank score also increased. The
regression equation for this group is score =
4.74 + 0.16 (initial error belief). The multiple R
was 0.53. This pattern of information search is a
confirming strategy. That is, following a high
error or irregularity initial attribution, auditors
with limited experience tended to seek out
information about errors which would possibly
confirm this initial attribution. The pattern of
results for this dependent measure supports the
hypothesis.


[Fig. 1. Initial hypothesis: probability of error/irregularity. Vertical axis: ranked score (internal items). Horizontal axis: initial hypothesis, probability of error/irregularity (%).]


TABLE 5. Number of auditors (n = 57) seeking error questions

                                                                       Number of auditors
Question                                                               seeking question
 1  Were payments on account made next period improperly recorded
    this period?                                                       17
 2  Have proceeds from new long-term debt been used to pay off
    substantial amounts of short term credit obligations?              19
 3  Were purchase returns not properly recorded this period?           22
 4  Was merchandise that was returned by customers this period
    improperly recorded this rather than next period?                   8
 5  Were accrued expenses not properly recorded this period?           45
 6  Have some expense items been improperly recorded as inventory?     31
 7  Have marketable securities not been properly adjusted to lower
    of cost or market?                                                 26
 8  Have some of next period’s sales been improperly recorded
    this year?                                                         41
 9  Were bad debt expenses not properly recorded this period?          38
10  Have purchases made and received this period not been recorded
    or improperly recorded next period?                                30
    Overall average                                                    4.86


TABLE 6. Results of regression analyses within experience groupings

Subjects with greater experience:

Variable          Coefficient   SRC¹      T-Value   F-Score   Prob.
Initial belief                  -0.034    -0.19     0.038     0.847
Intercept         8.714
Multiple R        0.034

Subjects with less experience:

Variable          Coefficient   SRC       T-Value   F-Score   Prob.
Initial belief                   0.53      2.72     7.43      0.013
Intercept         4.749
Multiple R        0.53

¹SRC = Standardized Regression Coefficient
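
To make the contrast between the two experience groups concrete, the sketch below evaluates the two regression equations as printed in the text at a low and a high initial error belief. It assumes the initial belief is expressed on the 0 to 100 likelihood scale used in the questionnaire and applies the six-year split described above; the function name is hypothetical.

```python
def predicted_rank_score(initial_error_belief: float, years_experience: float) -> float:
    """Predicted rank-weighted error score from the equations printed in the text.

    initial_error_belief is assumed to be a likelihood on a 0-100 scale.
    """
    if not 0 <= initial_error_belief <= 100:
        raise ValueError("initial_error_belief must lie between 0 and 100")
    if years_experience >= 6:
        # More experienced group: score = 8.17 - 0.01 (initial error belief)
        return 8.17 - 0.01 * initial_error_belief
    # Less experienced group: score = 4.74 + 0.16 (initial error belief)
    return 4.74 + 0.16 * initial_error_belief


for belief in (20, 60):
    less = predicted_rank_score(belief, years_experience=3)
    more = predicted_rank_score(belief, years_experience=10)
    print(f"belief {belief}%: less experienced {less:.2f}, more experienced {more:.2f}")
# belief 20%: less experienced 7.94, more experienced 7.97
# belief 60%: less experienced 14.34, more experienced 7.57
```

Moving from a 20% to a 60% initial belief raises the predicted score of the less experienced group by about 6.4 points, while the predicted score of the more experienced group barely changes.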


DISCUSSION 


A study was conducted to provide evidence 
on the initial planning judgments of auditors. 
The study may be viewed primarily as explorat- 
ory in that the existing auditing literature con- 
tains little evidence on the information seeking 
judgments of auditors. Audit subjects in the 
study were given information about the results 
of analytical tests. In response to this informa- 
tion, auditors were asked to provide an initial be- 
lief on the likelihood of the financial statements 
containing an error and to indicate the informa- 
tion they would initially seek. The hypothesis 
that was tested stated that the extent of the audit 
experience would moderate the association be- 
tween an initial belief and information seeking 
judgments. Specifically, the expectation was that
lesser audit experience would be associated
with the initial belief having a greater positive 
correlation with information seeking judgments. 

Before discussing the results of the study, sev- 
eral limitations will be noted. First, since the sub- 
jects were not randomly selected, inferences 
cannot be made to auditors in general. Since the 
subjects were selected by a liaison officer with 
each participating firm, it is possible that the 
selection process contained systematic bias. Sec- 
ondly, as the brief scenarios did not include all 
factors potentially relevant for the analytical 
review task, the results may not generalize to 
other settings. Still, the authors are confident 
that the scenarios contained sufficient informa- 
tion about the audit environment for testing the 
theoretical propositions concerning informa- 
tion search behavior. 

Based on a review primarily of the profes- 
sional auditing literature, the information given 
to audit subjects included manipulations of two 
factors (i.e., management integrity and industry
information) to facilitate auditor subjects form- 
ing diverse initial beliefs. We will first provide 
several observations concerning auditors’ initial 
beliefs. 

Auditors’ initial beliefs regarding the likelihood
that the financial statements contained a
material error were significantly affected by the
management integrity information. Auditors’
initial belief of an error was higher when
management integrity was low. This finding is
consonant with the professional auditing literature. This
was the case across both sets of auditors — those 
with greater experience and those with lesser 
experience. 

Information concerning industry averages 
was also found to significantly affect auditors’ in- 
itial beliefs of an error. Initial beliefs of an error 
were found to be higher when the results of
analytical tests for the client were dissimilar to
information for the industry. We believe this
finding to be of some significance because it
represents, to our knowledge, the first time the effect
of industry information on auditor judgment has 
been investigated. The results indicate that au- 
ditors employ this information in the manner 
that was predicted based upon the professional 
auditing literature. 

Finally, auditors’ initial beliefs were systemati- 
cally associated with the extent of work experi- 
ence. Auditors with less work experience 
reached initial error beliefs which tended to be 
higher than auditors with more experience. We 
may speculate as to why this result occurred. A 
major in accounting typically includes a required
course in auditing. Although this course
varies across accounting programs, it
generally focuses on the process of
obtaining sufficient, competent evidence on 
which to base an opinion. The course generally 
stresses how audit evidence is useful in detect- 
ing errors and may include a course project in 
which students actually detect financial state- 
ment errors. This type of education may lead stu- 
dents to form unrealistically high error rates for 
financial statements. Gradually, as auditors gain
experience and are exposed to an increasing
number of financial statements that contain only
a limited number of errors, they
learn to revise their error beliefs downward.

Turning to a discussion of the information 
search strategies, the results for the two
information search measures were slightly different
from one another. The results for the second
measure of information search, the ranking of six 
questions, supported the hypothesis. The group 
of more experienced auditors followed a ba- 
lanced information search strategy uncorrelated 
with initial hypotheses. This suggests that for 
this group their choice of questions to which to 
initially seek answers was guided by a well de- 
veloped cognitive structure. This structure not 
only provided a basis for interpreting and mak- 
ing sense of analytical test results but also pro- 
vided a basis for behavior. On the other hand, for 
the group of subjects with less experience, their 
information search judgments were significantly 
and positively related to the initial beliefs that 
had been made. These findings provide evidence 
that audit experience acts to moderate the infor- 
mation seeking judgments of auditors and that 
more experience is associated with more con- 
servative information seeking judgments. 

Considering the results from the first measure
of information search behavior (i.e., the number
of error questions selected), the results showed
that the initial error belief affected the number 
of error questions to which auditors would ini- 
tially seek answers. The interaction between 
years of experience and initial belief was not sig- 
nificant at traditional levels, however. Still, the 
direction of the interaction was similar to the 
second measure. Perhaps having auditors select 
ten unranked questions represented a less pre- 
cise or noisier measure. It may be the case that 
auditors when considering the results of analyti- 
cal review tests have a relatively short list of five 
or six questions and not ten. If this is the case the 
inclusion of more questions may just be adding 
noise. This possibility is supported by the lower
multiple R observed for the search measure
based on ten questions.


THE EFFECT OF AUDIT DOCUMENTATION FORMAT ON DATA COLLECTION*

Abstract


This study considered the impact of three different methods of documenting internal accounting control 
systems (internal control questionnaire, flowchart, and narrative memorandum) upon the quantity and 
type of data collected during the auditor's preliminary review of internal control. The effect of experience 
upon data collection was also considered. Fifty-five auditors participated in a laboratory experiment in 
which they collected data about a client’s control system, and recorded it in one of the three formats. 
Results indicated that subjects using an internal control questionnaire collected far more data than did 
other subjects. When using a questionnaire or flowchart, seniors collected more data than less experienced 
auditors. The type of data collected was shown to be associated strongly with documentation format but 
not with experience. Evidence that selective perception was induced by documentation format was
particularly strong for the questionnaire subjects.


A review of the client’s internal control system is 
normally conducted where the auditor has been 
engaged to express an opinion on the financial 
statements. A preliminary review of internal ac- 
counting control is performed after the client’s 
internal control environment has been studied 
and a tentative decision has been made to rely 
upon the internal accounting control system 
(AICPA, 1983, AU s. 320.01). The review has
important implications for audit design (Cook &
Winkle, 1980). Practitioners rated internal control
as the most influential factor in the planning,
implementation and evaluation of an audit
(Gibbins & Wolf, 1982).

The preliminary evaluation of internal accounting
control comprises three tasks: (1) collection
of information about the client’s internal
accounting control system; (2) documentation
of that information in audit working papers; and
(3) evaluation of the control system. Several
different approaches to the documentation of
control systems have received official sanction.¹
Of these, three formats — internal control ques- 
tionnaire, flowchart, and narrative memoran- 
dum — are widely used throughout the profes- 
sion (Cushing & Loebbecke, 1986). Each format 
has different advantages and disadvantages from 
an audit firm’s perspective (see Table 1). Once 
developed, an internal control questionnaire is 
the easiest of the three formats to complete. For 
that reason, less experienced auditors are often 
assigned to document the client’s system by fill- 
ing out an internal control questionnaire. Staff 
require careful training to prepare flowcharts,
and considerable experience is necessary to
draft an effective narrative memorandum.


*This paper is based upon my dissertation completed at Columbia University. Helpful advice was received from Jan Bell and
Gordon Shillinglaw in particular, and also from an anonymous reviewer and participants in workshops at the University of
Washington and the University of Southern California. Financial support from Columbia University and the University of
Hong Kong is gratefully acknowledged.

¹The Auditing Standards Board of the AICPA has authorized the use of internal control questionnaires, flowcharts, narrative
memoranda, decision tables and other formats which auditors find useful (AICPA, 1983, AU s. 320.53). While specialized
formats have recently been introduced by some of the “Big Eight”, these were not included in the study.


TABLE 1. Summary of recording format characteristics

Characteristics          Advantage                             Disadvantage

Internal control questionnaire
  Pre-printed            Cheapness and ease of preparation     Very generalized
  Standardized           Facilitates quality control across    Lack of flexibility
                         both audit personnel and clients
  Binary response mode   Ease and rapidity of completion       Difficult to determine the extent
                         (elaboration is necessary only for    and accuracy of work done by
                         exceptions)                           subordinates
Flowchart
  Sequential flow        Ensures that controls at function     Contains much detail that is not
                         boundaries are not overlooked         relevant to auditor
                         (Mock & Willingham, 1983)
  Tailor-made            Reflects perceived internal           Time-consuming to prepare
                         accounting control system of
                         audit client
  Graphic presentation   Facilitates review process            Requires some training
Narrative memorandum
  Tailor-made            Reflects internal accounting          Not standardized, and may be
                         control system of audit client        poorly structured
                         with considerable detail              (Carscallen, 1982)
  Very detailed          Guards against perfunctory data       Very time-consuming
                         collection
  Analytical             Facilitates the evaluation process    Requires considerable training
                                                               for effective use
Each format considered in this study imposes
a different degree of structure on the data collec-
tion and documentation tasks. Typically, ques-
tionnaires are highly structured, presenting the
auditor with a pre-printed list of questions to
answer. Omissions are easy to identify, but a re-
viewer cannot easily discern the accuracy of the
responses. While questionnaires are designed to
be comprehensive, they do not always reflect all
features of a client’s control system. There is the
potential that aspects of a client’s system may be
overlooked. Flowcharts have a logical sequential
structure that is a reflection of the client’s inter- 
nal control system. However, an auditor must as- 
certain what documentation is originated and 
how it is processed in order to depict it in a flow- 
chart. Narrative memoranda are structured by 
the individual auditor, who defines the goals of 





the audit task and designs a data collection 
strategy. 

Until the pioneering work of Peat, Marwick, 
Mitchell & Co. (Mock & Willingham, 1983), and 
Deloitte, Haskins and Sells (Holstrum, 1984), 
the function and effect of alternative formats for 
documenting internal accounting control sys- 
tems were largely unexplored in the profes- 
sional and academic auditing literatures. Mock & 
Willingham noted that:


... there were no explicit professional guidelines and
standards for documenting accounting controls. Under- 
lying theory was sparse, and little behavioral research 
had been done on how individuals make judgments on 
controls (1983, p. 92). 


In previous audit research, little attention has
been directed toward examining the role of data
collection² in the evaluation of internal control
systems (see Libby, 1981; Felix & Kinney, 1982
and Gibbins, 1984). Most studies provided the
subject with a limited data set, typically contain-
ing eight or fewer cues (see the Ashton group of
studies, 1974a, 1974b, 1980; Hamilton &
Wright, 1982 and Gaumnitz et al, 1982), but of-
fered “little or no opportunity for additional in-
formation search” (Knechel & Messier, 1988, p.
1). Further, these studies focused, almost exclu-
sively, on the evaluation phase of the internal
control review.

²While two accounting studies (Pankoff & Virgil, 1970 and Abdel-Khalik & El-Sheshai, 1980) incorporated data collection in
research tasks, neither study explicitly formulated hypotheses relating to data collection.

The purpose of this paper is to examine the 
impact of the three formats upon the data collec- 
tion phase of the internal accounting control sys- 
tem review. The format chosen for the 
documentation of the client’s system may affect 
the quantity and type of data collected. Further, 
since documentation of the client’s internal ac- 
counting control system is often assigned to the 
more junior members of the audit team 
(Johnson, 1981), this study also considers
whether the quantity and type of data collected
are affected by audit experience. These issues are
of importance because the depiction of the
client’s system in the audit records forms the
input data set to the evaluation of internal con-
trol strength.³

The remainder of this article is organized as 
follows. The second section reviews relevant lit- 
erature and develops the broad hypotheses to be 
investigated. Next are described the research 
design, the research instruments, and the ad- 
ministration of the experimental task. The 
results are then summarized and the final section 
discusses the implications of the key findings for 
auditing and audit research. 


THEORETICAL DEVELOPMENT 


The auditor, in reviewing the client’s internal 
accounting control system, examines standard 
operating procedures and transaction documen- 
tation, and interviews key members of the or- 
ganization. In so doing, the auditor necessarily 
enacts his or her own environment. Enactment
is an interactive process employed in “making
sense” of the environment whereby information
is selected for consideration (Weick, 1979). If
new information is congruent with prior knowl-
edge and/or expectations, sampling ceases. If
not, more information is selected until the
phenomenon is understood and integrated with
existing perceptions. This interactive process
may result in the modification of the original set
of expectations, or an updating of the knowledge
base, as appropriate (Neisser, 1976; Waller &
Felix, 1984a).

Since the process of enactment affects the
selection of information, audit firms have de-
veloped standard operating procedures to guide
the data collection and documentation process.
Tasks may be structured in different ways and to
different degrees to ensure a homogeneous
audit product. Firms provide specific instruction
and on-the-job training intended to “mitigate the
effects of individual human differences” (Weber,
1978, p. 371). The adoption of standardized,
pre-printed forms, guides or checklists; the in-
troduction and enforcement of approved stand-
ard operating procedures; and the development
of audit manuals and other reference works are
all methods of standardizing the audit product.

Standardization offers many advantages. In
general, standard operating procedures help
control the level of ambiguity and uncertainty in
an organization’s information flow (Weick,
1969, p. 40; Ijiri et al, 1966), thus easing the
communication burden (March & Simon,
1958). They also facilitate staff training and are
important tools of quality control (Cushing &
Loebbecke, 1986, p. 42). In standardizing proce-
dures for documentation of a client’s internal
control system, the audit firm hopes to ensure an
audit product that is a reliable input into the
evaluation process, that is cost-effective and in
compliance with professional requirements.
Cushing & Loebbecke also observed that “struc-
tured techniques may reduce the time required
to perform many of the more routine tasks in-
volved in an audit” (1984, p. 10).

³The problem of the validity of the representation of the client’s system in the audit records is exacerbated in those audit firms
where the system is documented by one person, often an inexperienced auditor, and is evaluated by another. However, this
issue is not explored in this study.
Given the extent of competition in the audit
world, there is considerable emphasis on cost-ef-
fective auditing. Further, concerns about high
staff turnover, professional liability and the im-
pact of peer review promote increasing formali-
zation of audit procedures (Dirsmith & McAllis-
ter, 1982b). However, standardized procedures
may also have disadvantages. For example, Mar-
tin cautioned:


A standard operating procedure is a schema that struc-
tures dealing with an environment. A standard operating
procedure is a frame of reference that constrains explora-
tion and often unfolds like a self-fulfilling prophecy
(1977, p. 82).


Dirsmith & McAllister (1982a) expressed a con- 
cern that increased formalization might lead to 
“mechanistic” audits and hamper an audit of 
clients with a more open and adaptive structure. 
Cushing & Loebbecke noted that the use of
structured formats may induce the auditor to: 


become mechanistic in his or her thinking. This could 
cause the auditor to fail to observe important facts, or to 
fail to reason through to appropriate judgments and con- 
clusions (1986, p. 43). 


In summary, documentation format plays an 
important organizational role as a standard 
operating procedure in that it not only standar- 
dizes audit working papers but also imposes a 
structure on the data collection process. The 
three most commonly used formats for docu- 
menting internal accounting control systems — 
internal control questionnaire, flowchart and 
narrative memorandum — differ in the degree to 
which each explicitly structures the data-gather-
ing and documentation task. Yet task structure
may affect several parameters of the data collec-
tion process, including quantity and type of data
actually gathered. These aspects are considered
below.


Quantity of data collected

Each documentation format emphasizes diffe-
rent aspects of the client’s internal accounting
control system. For example, questionnaires
tend to be very comprehensive, covering all key
procedural, organizational and informational
controls (Defliese et al, 1975). Narrative
memoranda, if well designed, may be com- 
prehensive in scope yet may concisely describe 
the audit objectives and the client’s controls that 
meet those objectives (Martin, 1981). Finally, 
flowcharts tend to focus on document-related 
procedural and organizational controls, with 
considerable attention to detail (Mock & Willin- 
gham, 1983). Thus the scope of each format may 
affect the quantity of data collected, with sub- 
jects using the questionnaire collecting the most 
data and flowchart subjects obtaining the least. 
The foregoing discussion suggests the following 
hypothesis: 


H1. Questionnaire users will collect more data than nar- 
rative users, who, in turn, will collect more data than 
flowchart users. 


Selective perception in data collection 

To date, there are no published studies exa- 
mining the relationship between task format and 
data collection in auditing. Researchers in con- 
sumer information processing contrasted three 
different information presentation formats and 
hypothesized:


... that information presentation format affects informa- 
tion acquisition patterns. A competing hypothesis .. . 
(was) that consumers have a preferred strategy for ac- 
quiring information, which they apply to the information 
they receive, regardless of the format of that information 
(Bettman & Kakkar, 1977, p. 234). 


Their results strongly supported the format 
hypothesis. However, Bettman & Zins (1979) 
and Capon & Burke (1980) both questioned 
these findings. The former suggested that format 
differences may occur only in tightly structured 
tasks. The latter, noting that some of Bettman & 
Kakkar’s subjects persisted in using a processing 
strategy inconsistent with the manner in which 
the data was presented, argued that individuals 
may have a preferred strategy, but “the observed 
strategies . . . result from an interaction of the 
preferred strategies with individual/product 
class variables . . . and task related factors” (p. 
315).

Another experimental example of format-in- 
duced pitfalls is provided by Fischhoff et al. 
(1978). They presented their subjects, includ-
ing professional auto mechanics, with a variety
of information and a diagnostic aid for determin- 
ing the ostensible cause of a vehicle’s break- 
down. The aid was a “fault tree”, a comprehen- 
sive checklist structured to eliminate non-prob- 
able causes from further analysis. The authors 
noted that the subjects, including experts, relied 
heavily on the diagnostic aid and failed to recog- 
nise possible causes of malfunction that were 
not addressed by the fault tree. 

A documentation format, whether it be 
explicitly structured, as is an internal control 
questionnaire or decision table, or implicitly 
structured, as are flowcharts and narrative 
memoranda, is like a schema,⁴ guiding the au-
ditor through the maze of information that is the
client’s internal accounting control system. 
Neisser explained it thus: 


The schema accepts information as it becomes available 
at sensory surfaces and is changed by that information; it 
directs movements and exploratory activities that make 
more information available, by which it is further mod- 
ified ... Information that does not fit such a format goes 
unused. Perception is inherently selective (1976, pp. 54-55,
emphasis supplied).


If selective perception is induced by 
documentation format, then subjects may be ex- 
pected to collect only data listed in or prompted 
by the documentation format employed. For 
example, questionnaire users may attend to cues 
listed in the questionnaire, but may overlook 
others that are not. Flowcharts tend to focus on 
the organizational and procedural controls per- 
tinent to the various documents in the client's 
system — even though these may not be rele- 
vant to the audit objectives (Mock & Willin- 
gham, 1983) — overlooking informational con- 
trols and procedural and organizational controls 
that may be important to the audit objective, but 
which are not directly related to the documenta- 
tion. 

The narrative memorandum may be struc- 
tured almost entirely according to the wishes of 
the auditor. The focus could be, for example, on
specific controls, personnel, or sequential 
documentation flow. It is not possible to predict 
which cues will be selected for consideration by 
subjects using the narrative memorandum, as 
this format has no structure other than that im- 
posed by the preparer. The auditor may choose 
to develop a narrative memorandum that is, for 
example, a verbal equivalent of a flowchart or an 
analysis of controls by department. Memoranda 
may also differ with respect to the amount of de- 
tail recorded. 

In summary, items may be examined if promp- 
ted by the documentation format, and may not 
be otherwise. This analysis suggests the follow- 
ing hypothesis: 


H2. Users of documentation formats will collect only data 
prompted by their formats. 


The impact of training and experience on data 
collection 

While many of the studies noted earlier consi- 
dered the impact of experience on internal con- 
trol evaluations, very few auditing studies have 
systematically examined the effects of training 
and experience on task performance (Waller & 
Felix, 1984b). Yet audit firms are aware of the 
differential impact of staff skill levels (Johnson,
1981). In practice, an audit team is usually com- 
prised of a number of staff with varying degrees 
of experience. Jaenicke discussed the traditional 
division of labor between audit junior and
senior: 


The audit team seeks to identify exceptions. The primary 
role of preliminary identification is assigned to the mem- 
bers with the least amount of experience with the client’s 
accounting system and with the auditing process 
(Jaenicke, 1980, p. 72).


The consequences of this division of labor have
not been examined in the academic auditing lit- 
erature. 

⁴Hastie defined “schema” to include “almost any of the abstract hypotheses, expectations, organizing principles, frames,
implicational molecules, scripts, plans or prototypes that have been proposed as abstract mental organizing systems or
memory structures” (1981, p. 39).

The consumer information processing studies
of Bettman & Park (1980) and Moore &
Lehmann (1980) showed that in relatively sim-
ple tasks (compared with an audit), such as the selec-
tion of a loaf of bread or a microwave oven, the
relationship between a knowledge and experi- 
ence factor and the quantity of data collected 
was curvilinear. With low knowledge levels, lit- 
tle data was collected. As knowledge increased, 
so did the quantity of data collected. Finally, at 
higher experience levels, the amount of data 
sought declined. Although the authors did not 
discuss this, the latter phenomenon might be 
explained in terms of an enhanced appreciation 
of key cues and a recognition of redundancy in 
the data set. In complex auditing tasks, Mock & 
Turner (1981) and Biggs et al. (1988), found an 
increasing relationship between knowledge 
level and quantity of data collected, but the lat- 
ter study used only four subjects. Finally, Weiser 
& Shertz (1983) and Biggs et al (1988) argued 
that experts organized their data differently 
from novices. As noted by Larkin et al: 


The most obvious difference between the expert and 
novice is that the expert knows a great many things the 
novice does not know and can rapidly evoke the particu-
lar items relevant to the problem at hand (1980, p. 1336).


In summary, more experienced auditors may 
have a better understanding of the task and be 
more familiar with the elements of a well-de-
signed internal accounting control system than
less experienced auditors. They would draw 
upon this knowledge in designing and in imple- 
menting a data search strategy (Oatley, 1978, p. 
208). This discussion suggests the following 
hypothesis in the context of this study: 


H3. The more experienced the auditor, the more data
will be collected.


More experienced auditors may also be more 
alert to potential weaknesses in the system and 
less dependent on the documentation format as
a prompting mechanism. This proposition is in
accord with the observation of Larkin et al. that: 


recognition of a pattern often evokes from memory 
stored information about actions and strategies that may 
be appropriate in contexts in which the pattern is present 
(1980, p. 1336). 


This suggests the following hypothesis: 


H4. More experienced auditors will collect different
types of data than less experienced auditors.


RESEARCH DESIGN 


In order to test the principal hypotheses, a 
laboratory experiment was conducted with 55 
practicing auditors. Six CPA firms in the north-
eastern United States contributed subjects with
one to three years of audit experience that in- 
cluded exposure to documentation of internal 
accounting control systems in manufacturing 
companies.

The results of the experiment were analysed 
in a two-way analysis of variance framework. The 
key dependent variables were the quantity and 
type of data collected. The two independent var- 
iables examined were the documentation for- 
mat and the level of the subject's audit experi- 
ence. Thus, the randomized block design 
employed was a mixed model, with levels of 
treatment, the three documentation formats, as 
fixed effects. Blocking the subjects on the basis 
of their experience was equivalent to a random 
effect, since different subjects used in a replica- 
tion would have different levels of experience 
(Kirk, 1982, p. 293). 

To capture experience, Ashton & Kramer 
(1980) and Gaumnitz et al (1982) used the 
length of employment. As this approach tends to 
misclassify “late starters” and “young geniuses”, 
this study classified subjects ex post facto⁵ into
the categories of “junior”, “semi-senior” and
“senior”, based upon their scores on an experi-
ence index. This construct (Kerlinger, 1986)
was developed from the subjects’ responses to
four groups of questions about educational back-
ground,⁷ and seven groups of questions about
work experience and participation in training
programs, contained in a 30 item debriefing
questionnaire.

⁵To have done otherwise would have required the subjects to participate in two experimental sessions: the first to supply
biographical data and the second, held after their responses had been analysed, to perform the experimental tasks. A two-stage
approach might have alerted subjects to the fact that they were participating in an experiment. Further, such an approach
could have reduced the firms’ willingness to participate in the experiment.

Subjects were randomly assigned to 
documentation formats. The randomization was 
effective, with no association between 
documentation format and experience detected 
(chi-square = 2.02 with 4 df, p = 0.73).
Further, the chi-square value suggests that, al- 
though the cell sizes were unequal, they did not 
pose a threat to the analytical strategies 
employed. 
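The randomization check can be retraced from the cell counts in Table 2. The following is a minimal sketch rather than the author's procedure: it assumes a standard Pearson chi-square test of independence, infers the junior/flowchart count of 5 from the column total of 18, and uses a closed-form tail probability that holds only for even degrees of freedom. The function names are illustrative.

```python
import math

# Subjects per cell, read from Table 2.
# Rows: junior, semi-senior, senior. Columns: ICQ, FC, NM.
# Assumption: the junior/flowchart count (5) is inferred from the FC column total (18).
observed = [
    [9, 5, 7],
    [4, 7, 6],
    [6, 6, 5],
]


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total


stat, df = chi_square_independence(observed)
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.2f}")
```

Under these assumptions the statistic comes out close to the reported 2.02 with 4 df, and the tail probability is about 0.73, which is why the check reads as showing no association between format and experience.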

Following the recommendations of earlier re- 
searchers (Hogarth, 1974; Brehmer, 1976; 
Phelps & Shanteau, 1978), the stimuli and task 
were designed to approximate the real world 
task in as many dimensions as possible. Audit 
case materials were developed from a descrip-
tion of a real audit client engaged in manufactur-
ing; supporting documentation normally found 
on an audit engagement was provided; the par- 
ticipants were all professional accountants cur- 
rently using audit skills similar to those studied 
in the experiment; and the subjects were re- 
quired to initiate their own search for data, 
rather than react to a highly-structured cue set. 


Administration of the task 
The task required the subjects to perform the 
following operations in a preliminary evaluation
of the internal accounting control system of the
revenue transaction cycle⁸ of an audit client.


(1) Read the introduction to the audit engage- 
ment, and review the draft audit program. 


(2) Review extracts from the client’s perma-
nent file.⁹


(3) Collect data about the internal accounting
control system by requesting information from 
the client. 


(4) Complete a standard internal control 
questionnaire, or prepare a flowchart or a narra- 
tive memorandum, according to the treatment 
assigned. 


Ten separate sessions, each with five or six 
subjects, were conducted. Four hours were al-
lotted for the experiment,¹⁰ including debrief-
ing. Subjects received an experimental packet
containing background materials and approp- 
riate stationery. These packets were identical ac- 
ross treatment groups, except for one differing 
sentence in which subjects were asked either to 
complete a questionnaire, or to prepare a flow- 
chart or a narrative memorandum. 

Normally, the auditor would interrogate the 
client’s personnel in order to obtain informa- 
tion. Instead, in the experiment, subjects wrote 
questions on a “data request card”, for submis- 
sion to the experimenter or assistant. The re- 
quest was time-stamped and a time-stamped data
card was returned to the individual.


⁶To have conceptual validity, a new construct should be consistent with known methods. There was a very high degree of
association between the experience index scores and months of service (Kendall’s tau b = 0.728, p = 0.000), with 89.1%
of the cases being classified in the same manner. The few differences were found on the junior/semi-senior border or on the
semi-senior/senior border. On average, those classified as juniors in this study had worked as auditors for 13.8 months. Semi-
seniors had 25.5 months and seniors had 35.4 months of audit work experience.

⁷Although “education” was included in the construct, this component did not prove useful in distinguishing between the
three levels of experience, perhaps because of the strict and fairly uniform education requirements imposed by State Boards
of Accountancy.

⁸The revenue transactions cycle was selected because of its presumed familiarity to the least experienced subjects. For
similar reasons, Joyce (1976) couched his study of judgment in audit program planning in an accounts receivable framework,
as did Mock & Turner (1981) and Biggs & Mock in their study of audit scope decisions (1983).

⁹Similar documentation was provided by Mock & Turner to their audit seniors and supervisors (1981, p. 43).

¹⁰Earlier studies were much less generous with time. Mock & Turner, for example, required two hours (1981, p. 58).
RESULTS

Quantity hypotheses

To test the association between data quantity
and documentation format and level of experi-
ence, a two-way analysis-of-variance (ANOVA)
was performed, with the number of data cards
collected as the dependent variable. Kirk recom-
mended that the full rank experimental design
model be adopted to analyse experiments with
unequal cell sizes (1982, p. 420). The SPSS MAN-
OVA procedure (Hull & Nie, 1981) “deals au-
tomatically with unorthogonality produced by
unequal samples sizes” (Tabachnik & Fidell,
1983, p. 259) and was employed to test all the
univariate and multivariate analyses-of-variance.

An inspection of the marginal means in Table
2 reveals that the number of data cards collected
varied with documentation format in the man-
ner hypothesized (H1). As hypothesized, sub-
jects using the internal control questionnaire
collected the most data, on average, and flow-
chart subjects collected the least. The ANOVA
confirms that this result is significant (F = 7.47,
p < 0.002). In the time frame of this experiment,
questionnaire subjects collected 32.8% and
24.9% more data cards than their flowchart and
narrative counterparts, respectively. The narra-
tive subjects collected only 6.3% more data
cards than flowchart subjects. The association
between the quantity of data collected and
documentation format was quite strong
(alternative η² = 0.245).¹¹

The seniors collected about 22% more data
on average than the juniors and semi-seniors,
who collected about 23 cards each. The effect of
experience was particularly strong in the ques-
tionnaire and flowchart formats, in which
seniors collected 35% and 28% more data than
did their less experienced counterparts. The
strength of association between data collection
and the blocked experience level was moderate
(alternative η² = 0.158). The finding that seniors col-
lected more data than those with less experi-
ence is consistent with the results of Biggs et al.
(1988), and, in general, supports the hypothesis
(H3).
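The percentage comparisons between formats can be retraced from the marginal means in the "Total" row of Table 2. This is only an illustrative sketch: it works from the rounded means as printed, and the helper name `percent_more` is not from the paper.

```python
# Marginal mean number of data cards per format, from the "Total" row of Table 2.
mean_cards = {"ICQ": 29.00, "FC": 21.83, "NM": 23.22}


def percent_more(a, b):
    """Percentage by which mean a exceeds mean b."""
    return (a / b - 1.0) * 100.0


icq_vs_fc = percent_more(mean_cards["ICQ"], mean_cards["FC"])
icq_vs_nm = percent_more(mean_cards["ICQ"], mean_cards["NM"])
nm_vs_fc = percent_more(mean_cards["NM"], mean_cards["FC"])

print(f"ICQ vs flowchart:       {icq_vs_fc:.1f}% more cards")
print(f"ICQ vs narrative:       {icq_vs_nm:.1f}% more cards")
print(f"Narrative vs flowchart: {nm_vs_fc:.1f}% more cards")
```

The first two figures agree with the reported 32.8% and 24.9%. The third comes out at roughly 6.4% from the rounded means rather than the reported 6.3%, a gap that is presumably due to rounding in the published table.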

There was little difference between the 
juniors and the semi-seniors in any format. 
Further, the level of experience apparently had 
no effect on the quantity of data collected by 
subjects using a narrative memorandum. These 
findings may suggest that a certain period of time 
must elapse before one becomes accomplished 
at different audit tasks, and that period exceeds 
two years in the case of the questionnaire and 
flowchart. The narrative memorandum, having 
the least explicit structure, may require an even 
longer “training” period. 

The internal control questionnaire dominated 
the other formats at every level of experience. In 
fact, juniors using the questionnaire collected 
more data than did seniors using either the flow- 
chart or the narrative memorandum. [Thus the 
interaction between experience and format was 
not statistically significant (F = 1.03, p = 0.401).]
The results also suggest that assignment of less 
experienced staff to the less structured formats 
may be costly to the audit firms. 
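The F ratios and strength-of-association figures quoted in this section can be checked against the sums of squares in the ANOVA panel of Table 2. The sketch below rests on an assumption: that the "alternative eta squared" is the effect sum of squares divided by the sum of the effect and within-cells sums of squares. The paper does not spell the formula out, but this form reproduces the reported 0.245 and 0.158. Function and variable names are illustrative.

```python
# Sums of squares and degrees of freedom as printed in the ANOVA panel of Table 2.
effects = {
    "format": (538.32, 2),
    "experience": (311.13, 2),
    "format_by_experience": (148.38, 4),
}
ss_within, df_within = 1657.65, 46
ms_within = ss_within / df_within


def f_ratio(ss_effect, df_effect):
    """F = mean square of the effect divided by the within-cells mean square."""
    return (ss_effect / df_effect) / ms_within


def alternative_eta_squared(ss_effect):
    """Assumed formula: SS_effect / (SS_effect + SS_within)."""
    return ss_effect / (ss_effect + ss_within)


for name, (ss, df) in effects.items():
    print(f"{name:22s} F = {f_ratio(ss, df):5.2f}   "
          f"alternative eta squared = {alternative_eta_squared(ss):.3f}")
```

Recomputing from the printed sums of squares gives F values of about 7.47, 4.32 and 1.03, in line with the table, so the non-significant interaction and the two main effects are internally consistent with the reported mean squares.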


Selective perception hypotheses

The second pair of hypotheses considered 
whether documentation format (H2) and ex- 
perience (H4) directed attention towards or 
away from certain data items. To operationalize 
a test of the attention-directing hypotheses, the 
client’s system was divided into 15 different con- 
trol areas (e.g. sales orders, receipts, sales terms, 
bad debts, etc.). The number of cards requested 
per control area was subjected to a multivariate 
analysis-of-variance, using Pillai’s criterion.¹²
The multivariate analysis showed a very strong 
effect for format (F = 3.433, p = 0.000), but no 
effect for experience (F = 0.873, p = 0.651). 

The questionnaire subjects obtained much 


¹¹As the cell sizes are unequal, alternative eta squared (η²) is employed to measure the strength of association between the
dependent and independent variables (Tabachnik & Fidell, 1983, p. 47).

¹²Pillai’s criterion was adopted as a measure of the statistical significance of multivariate F, as it is more robust than Wilks’
lambda, Roy's greatest root and Hotelling’s trace (Tabachnik & Fidell, 1983, p. 249).
TABLE 2. Number of data cards collected by documentation format and experience

Experience                   Documentation format
                             ICQ      FC       NM       Total
Junior        Mean           26.33    20.00    22.29    23.48
              (Subjects)     (9)      (5)      (7)      (21)
Semi-senior   Mean           26.00    19.86    24.17    22.82
              (Subjects)     (4)      (7)      (6)      (17)
Senior        Mean           35.00    25.67    23.40    28.30
              (Subjects)     (6)      (6)      (5)      (17)
Total         Mean           29.00    21.83    23.22    24.76
              (Subjects)     (19)     (18)     (18)     (55)

Where: ICQ = internal control questionnaire; FC = flowchart; NM = narrative memorandum.
The numbers in parentheses indicate the number of subjects in each cell.


ANOVA of data cards by documentation format and experience

Source                 df   Sum of squares   Mean square   F        Significance
Format                 2    538.32           269.16        7.47     0.002
Experience             2    311.13           155.56        4.32     0.019
Format by experience   4    148.38           37.21         1.03     0.401
Constant               1    33,728.07        33,728.07     935.96   0.000
Within cells           46   1,657.65         36.03
data (mostly on organizational and information
controls) that in general was not sought by 
people using flowcharts and narrative 
memoranda. Spearman correlations of data card 
use by all subjects show that the flowchart and 
narrative subjects collected very similar data, 
primarily focusing on procedural controls (see 
Table 3, panel A). (These results are also de-
picted by the Venn diagram in Fig. 1). This find-
ing supports the view that many individuals con-
ceptualize the narrative memorandum as a ver-
bal version of a flowchart, with perhaps a further
degree of detail and greater analysis. 

While the multivariate test did not support the 
experience hypothesis overall, in three control 
areas where the revenue transaction cycle cros- 
sed functional boundaries (inventory control, 
production and marketing), the juniors col- 
lected much more data, including results of “in- 
terviews” with accounting personnel, than did 
the seniors. Perhaps the seniors collected suffi- 
cient data to satisfy their audit objective in these 
areas, whereas the juniors over-audited. The 
finding is consistent with Carscallen’s observa- 
tion that junior staff: 


may well also spend too much time on things they under- 
stand and overlook the difficult and tricky points that are 
often the key to a good audit (1982, p. 22). 


The literature also suggested that information 





[Figure 1: Venn diagram of data card use by the internal control questionnaire, flowchart and narrative memorandum formats; graphic not recoverable.]

Note: The values represent the number of cards
selected by at least three subjects using that
format. Nineteen cards that were chosen by only
one or two subjects in each format are not
shown in the figure. A further nine
were never selected.

Fig. 1. Data card use by documentation format.
TABLE 3. Spearman correlations of data card use by documentation format and experience

Panel A:                                      Panel B:
              Flowchart   Narrative                         Semi-senior   Senior
                          memorandum
ICQ           0.51†       0.48†               Junior        0.88*         0.86*
Flowchart                 0.84*               Semi-senior                 0.95*

Note: * p < 0.001; † p < 0.05.


may not be collected if it is not highlighted by 
the documentation format assigned. This theory 
was subjected to a limited test by deleting from 
the internal control questionnaire questions re- 
lating to controls on back orders and sales terms. 
Only four of the 19 questionnaire subjects — 
one semi-senior and three seniors — inquired 
about controls on back orders. Four and five 
times as much data was requested in this area by 
the flowchart and narrative subjects, respec- 
tively. The null hypothesis that the same amount 
of data is collected by subjects in each format is 
rejected (chi-square = 11.26, 2 df, p < 0.01). 
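
The reported significance level can be checked by hand, because a chi-square variate with exactly 2 degrees of freedom has the upper-tail probability exp(-x/2). The following is a minimal sketch of that check; the function name is illustrative, and only the statistic 11.26 and the 2 df are taken from the text.

```python
import math


def chi2_upper_tail_2df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variate with 2 df.

    With 2 degrees of freedom the chi-square distribution is an
    exponential distribution with mean 2, so the tail is exp(-x / 2).
    """
    return math.exp(-x / 2.0)


# Statistic reported in the text: chi-square = 11.26 with 2 df.
p_value = chi2_upper_tail_2df(11.26)

# Critical value exceeded with probability 0.01 under the null (2 df only).
critical_01 = -2.0 * math.log(0.01)

print(f"p = {p_value:.4f}, critical value at 0.01 = {critical_01:.2f}")
```

The statistic exceeds the 0.01 critical value, which agrees with the reported p < 0.01. The closed form holds only for 2 df; other degrees of freedom need the general chi-square distribution.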

Even stronger results were obtained from the 
sales terms controls. Not one questionnaire sub-
ject collected data on the sales terms offered by
the client, whereas there were nine and five re- 
quests from the flowchart and narrative subjects. 
Controls in the revenue cycle which were not 
addressed by the questionnaire were rarely in- 
vestigated by the auditors participating in this 
experiment. These findings are consistent with 
those of Fischhoff et al. (1978) and the warnings 
sounded by Dirsmith & McAllister (1982a,b) 
and by Cushing & Loebbecke (1983). 

Overall, the results suggest that the documen- 
tation format induced the subjects — even ex- 
perienced subjects — to examine certain areas 
and to overlook others that were not highlighted 
or prompted by the format assigned. 


Reactions to the experimental task 

In the debriefing questionnaire, subjects were 
asked to report the perceived portion of the task 
completed, the level of interest in the task, the 
perceived level of task difficulty, and the useful- 
ness of the task as a training tool. A review of the 
subjects’ responses to the debriefing question- 
naire suggested that there were no confounding 
factors which may have systematically affected
their task performance. In general, the subjects’
attitude to the experiment was very positive. 
They obviously worked hard throughout the 
morning. Once they had completed the debrief- 
ing questionnaire, most of the subjects stayed to 
discuss the experiment and to offer their views 
on the problems associated with the documenta- 
tion of internal controls. 


CONCLUSIONS


In summary, the results of this laboratory ex- 
periment suggest that the quantity of data col- 
lected is affected by both documentation format 
and level of experience. In this experiment, sub- 
jects using the questionnaire collected more 
data than subjects using a flowchart or narrative 
memorandum. Within each format, seniors col- 
lected more data than less experienced auditors. 
The finding that questionnaire subjects covered 
a much greater portion of the revenue transac-
tions cycle than did other subjects has implica-
tions for audit efficiency. The auditor is charged
with the responsibility of collecting “sufficient 
competent evidential matter”. As suggested by
Cushing & Loebbecke (1984), it would appear
that personnel using a questionnaire would 
achieve the sufficiency objective more rapidly 
than those using the other formats tested. 
Second, given that a flowchart or questionnaire 
format is selected, the sufficiency objective would
be achieved more rapidly by audit seniors than 
by their subordinates. 

The second objective of this study was to con-
sider whether documentation format and ex-
perience induced selective perception. The
results suggest that subjects collected data 
prompted by the format, overlooking other data 
relevant to the audit objective. This has implica-
tions for both the sufficiency and the compe-
tency objectives of data collection. If no data are
examined in a particular control area, then
neither the sufficiency nor the competency ob-
jective is achieved for that area, even though
the sufficiency objective may appear to have 
been met for the internal control system as a 
whole. 

Overall, these findings should be of some con- 
cern to the auditing profession. First, it appears 
that if only one documentation format is used, 
the structure of the format affects the type of 
data collected, and that significant features of the 
client’s internal control system may be over- 
looked if they are not specifically addressed by 
the format. Eleven of the 12 audit firms 
examined by Cushing & Loebbecke (1983) cur- 
rently tackle this problem by requiring the use of 
two or more formats. (Usually each format is 
prepared by a different member of the audit 
team.) The Venn diagram (Fig. 1), together with 
the Spearman correlations (Table 3, panel A), 
suggest that this approach may be sub-optimal 
because there is substantial overlap between the
formats. If two or more basic formats are used in 
the documentation phase of the preliminary 
review, many pieces of information would be
collected and recorded at least twice, reducing
overall audit efficiency and increasing audit firm
as well as client costs.

The results reported here were obtained from
a laboratory experiment. Although the research 
task strove to approximate the audit environ- 
ment closely, it differed in two important re- 
spects: (1) data was collected from the experi- 
menter, rather than from the client’s personnel 
and (2) the time frame of the experiment was 
constrained. It would be worthwhile to conduct 
another laboratory experiment, or even a field 
experiment, that would overcome these draw-
backs. That would permit a more detailed exami-
nation of the relationships between documenta-
tion format and audit efficiency (especially the
quality of data collected and the time required) 
and audit effectiveness (especially the type of 
data overlooked by each format). 
Abstract


This paper extends previous research on auditors’ choice of information cues by considering whether 
confirmatory strategies are more evident in auditing contexts in which judgments are made sequentially 
and by examining the effect of providing cue diagnosticity on information choice. Consistent with previous 
research, there was only weak support for the confirmatory bias as it was dominated by the consistent 
selection of more failure than viable cues. The comparative preference for failure compared to viable cues
was affected by hypothesis framing when prior information (ratios) indicated nonfailure, but not when
prior information indicated failure. The study also found that the comparative number of failure to viable
cues chosen is less for subjects given information on cue diagnosticity than for subjects not given this
information.


The importance of hypotheses in organizing and 
directing information search by expert decision 
makers has been recognized in the audit judg- 
ment literature (see Libby, 1981). However, 
consistent with much of the psychological litera- 
ture, audit judgment research has concentrated 
on judgments made from given pieces of infor- 
mation and has generally neglected the preced- 
ing phases of hypothesis development and infor- 
mation gathering. Einhorn & Hogarth (1981) 
suggest that the process of information search 
and acquisition should also be considered since 
evaluation and search strategies are interdepen- 
dent. 


One auditing study that has adopted this 
suggestion is that of Kida (1984). He examined 
the effect of hypothesis framing on auditors’
search for, attention to and use of judgmental
data. While the psychological studies cited by 
Kida had shown a strong tendency for subjects 
to adopt confirmatory strategies (that is, when 
testing a hypothesis, individuals preferentially 
solicit evidence which tends to confirm, rather 
than disconfirm, their hypothesis), Kida found 
that while “confirmatory strategies were not 
overpowering, the initial hypothesis framing did 
affect data search and use.” Following Kida’s 
suggestion, the first purpose of this paper is to 
examine whether confirmatory strategies may 
be more evident in auditing contexts in which 
judgments are made sequentially as information 
is received. The second purpose of the paper is 
to examine the effect of the provision of informa-
tion on cue diagnosticity¹ on auditors’ choice of
information. 





We appreciate the useful suggestions of Freddie Choo, Ferdinand Gul, Mark Hirst, Tom Kida and Robert Libby. The research
assistance of Darcy Becker and Rosemary Stevens is also acknowledged. 


¹Diagnosticity refers to the extent that the conditional probability of the feature given the hypothesized trait is believed to
be different from that given the alternative trait.
DEVELOPMENT OF HYPOTHESES


Hypothesis framing and prior information 

In a series of studies of hypothesis testing pro- 
cesses in social interaction, Snyder and associ- 
ates (Snyder & Swann, 1978; Snyder & Cantor, 
1979; Snyder & Campbell, 1980; Snyder & 
Skrypnek, 1981) provided individuals with
hypotheses about the personalities of other
people and then asked the individuals to choose a
series of questions to ask their targets during the 
forthcoming interviews. Snyder & Gangestad 
(1982) suggest that the message from the above 
research on the hypothesis testing process is a 
clear one: 


As hypothesis testers, individuals systematically formu- 
late and enact confirmatory strategies of preferentially 
gathering evidence whose presence would tend to con- 
firm hypotheses under scrutiny. 


Snyder (1981) suggests that this commitment 
to confirmatory hypothesis testing strategies is 
quite pervasive. In their initial investigation 
(Snyder & Swann, 1978), participants chose to 
ask questions that solicited hypothesis confirm- 
ing evidence about twice as often as they chose 
to ask questions that solicited hypothesis discon- 
firming evidence. Snyder suggests that despite 
diverse attempts to identify circumstances in 
which hypothesis testers will avoid confirma-
tory strategies, subsequent investigation failed
to yield one circumstance in which hypothesis 
testers avoid confirmatory strategies or to iden- 
tify one procedure that successfully diminishes 
the magnitude of the preferential soliciting of 
hypothesis confirming evidence. 

Studies on hypothesis testing in settings other 
than social interaction also find evidence of the 
confirmation bias (Doherty et al., 1979; Einhorn
& Hogarth, 1978; Geller & Pitz, 1968; Mynatt et
al., 1977; Wason, 1968). Waller & Felix (1984)
review some of these studies in relation to ac- 
counting research and conclude: 


In sum, the main conclusion from the psychological liter-
ature regarding cognitive strategies for hypothesis test-
ing is the overwhelming tendency of the ordinary person
to seek and use data that serve to confirm or verify cur-
rently held beliefs.
Kida (1984), using similar research methods
to Snyder and associates, examined the effect of
hypothesis framing on auditors’ search for, atten-
tion to and use of judgment data. He divided his
subjects into two groups (known as the “failure 
hypothesis” and “viability hypothesis”) depend- 
ing on whether the information they were given 
was designed to set their initial hypothesis as 
either testing for failure or viability. After read-
ing a description of a firm, subjects were asked to
list the information from the description that 
they considered relevant to decide whether the 
firm would fail within two years (failure hypo- 
thesis group) or remain viable for at least two 
more years (viable hypothesis group). Kida 
found that while the initial framing of the hypo- 
thesis did have an impact on the types of infor- 
mation auditors considered relevant, the results 
did not provide strong support for the existence 
of confirmatory strategies. A strong confirmat- 
ory bias would lead subjects in the failure group 
to list more failure items than viable items and 
for those subjects in the viable group to list more 
viable than failure cues. As subjects from both 
groups listed more failure than viable cues, the 
results were not consistent with a strong confir- 
matory bias. However, there was weaker sup- 
port for the confirmatory bias since subjects in 
the viability group listed significantly more via- 
ble items than subjects in the failure group, and 
when considering the five highest ranked cues, 
the viable group listed fewer failure items than 
the failure group. Kida concluded as follows: 


Perhaps confirmatory strategies would be more evident 
in auditing contexts in which judgments are made se- 
quentially as information is received. For example, sup- 
pose that preliminary data lead the auditor to set a given 
belief about internal control or an account balance. That 
belief may have a stronger effect on the search for new 
data than alternative hypothesis framing, given that both 
supporting and nonsupporting data are potentially avail-
able.


In the present study, two pieces of informa-
tion, hypothesis framing (viability or failure) and
ratio data (strong or weak ratios), were provided
as preliminary data. The sequential model
suggested is that auditors combine these two 


AUDITORS INFORMATION CHOICE 


pieces of preliminary data into an initial belief 
about the viability/failure of the organization. 
This initial belief together with cue diagnosticity 
information (when it is available) results in the 
choice of additional information. The initial be-
lief together with the additional information
would then determine the final probability. 

In forming this initial belief auditors received 
one of four possible combinations of the 2 × 2
matrix for hypothesis framing and prior informa- 
tion (see “Research Methods” for further de- 
tails). Two groups of subjects have noncompet- 
ing hypotheses (“weak ratios/failure hypothesis” 
and “strong ratios/viability hypothesis”). For 
these two situations it is expected that the initial 
beliefs of the two groups will be failure and via- 
bility, respectively. For the two groups faced 
with competing hypotheses (“weak ratios/viabil- 
ity hypothesis” and “strong ratios/failure hypo- 
thesis”), the result is less clear. However, we 
suggest that in both cases, the information indi- 
cating the possibility of failure will have the most 
influence because of the high cost of an auditor 
not identifying a failed firm (Kida, 1984). 

If the combination of hypothesis framing and 
prior information works in this way, the com- 
bined initial hypothesis of subjects with “strong 
ratios/viable hypothesis” will be “viability” 
whereas the combined initial hypotheses for all 
other groups will be “failure”. Strong support for 
the confirmatory strategy would result in groups 
which receive “strong ratios/viable hypothesis” 
choosing more viable than failure items, while 
all other groups will select more failure than via-
ble items. However, given the findings of Kida
(1984) that, because of auditors’ implicit assess- 
ments of misclassification costs and/or some 
other unknown factors, subjects placed more 
emphasis on failure cues in all conditions, it ap- 
pears likely in this type of audit situation that the 
strong support for the confirmatory bias would 
be partly mitigated by the emphasis on failure 
cues. In this case, a weaker support of the confir- 
matory strategy would be that those subjects 
with the overall initial belief of viability (strong 
ratios/viable hypothesis) will choose a smaller 
percentage of failure cues than those subjects 
with an overall initial belief of failure (via weak 
ratios and/or failure hypothesis). Operationally,
this will result in hypothesis framing only having 
an effect when strong ratios are given. When 
weak ratios are given, the subjects already have 
one negative piece of information. 


Hypothesis 1: For subjects given strong ratios, the com-
parative number of failure to viable cues is affected by
hypothesis framing, but for subjects given weak ratios, 
hypothesis framing will not have an effect. 
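

The reasoning that leads to Hypothesis 1 can be summarized as a small decision rule over the 2 × 2 matrix of hypothesis framing and prior information. The sketch below is illustrative only: the function names and string labels are assumptions introduced here, and the rule encodes the text's conjecture that any signal of possible failure dominates the initial belief.

```python
from itertools import product

FRAMINGS = ("viability", "failure")   # hypothesis the subject is asked to test
RATIOS = ("strong", "weak")           # prior information supplied as ratios


def initial_belief(framing: str, ratios: str) -> str:
    """Combined initial belief under the conjectured model.

    A failure signal from either source (failure framing or weak ratios)
    is assumed to dominate, so only strong ratios together with a
    viability framing produce an initial belief of viability.
    """
    if framing == "viability" and ratios == "strong":
        return "viability"
    return "failure"


def framing_changes_belief(ratios: str) -> bool:
    """True if switching the framing alters the initial belief for these ratios."""
    return initial_belief("viability", ratios) != initial_belief("failure", ratios)


# Enumerate the four cells of the 2 x 2 matrix.
cells = {(f, r): initial_belief(f, r) for f, r in product(FRAMINGS, RATIOS)}
for (framing, ratios), belief in cells.items():
    print(f"{ratios} ratios / {framing} hypothesis -> initial belief: {belief}")

# Hypothesis 1 in this sketch: framing matters only when ratios are strong.
print(framing_changes_belief("strong"), framing_changes_belief("weak"))
```

Under this rule three of the four cells share the same initial belief, which is why a framing effect is expected only for subjects given strong ratios.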


Cue diagnosticity

There have been suggestions in the psychol-
ogy literature (see Trope & Bassok, 1982; Trope
et al., 1984) that the validity of the results of the
prior psychology studies finding a confirmatory
bias may be limited by the nature of their hypo-
thesis testing task. Trope et al. (1984) state that:


One potentially limiting factor is the kind of questions in- 
cluded in the list from which subjects had to select their 
own interview questions. Specifically, the introverted 
and extroverted questions in the list were all biased, i.e.,
they already assumed the interviewee was either an in-
trovert or an extrovert, thus making it difficult for him or
her to express the opposite trait. In all of Snyder and his
colleagues’ studies, then, interviewers were confined to 
nondiagnostic questions — questions that do not allow 
introverts and extroverts to respond differently. Evi- 
dently, this procedure masks any preference subjects 
may have had for diagnostic questions — questions that 
are expected to elicit different answers from introverts 
and extroverts.


Trope & Bassok (1982) suggest that informa- 
tion gatherers, wishing to decide whether or not 
a target person possesses the hypothesized trait, 
should be capable of selecting questions that 
elicit subjectively diagnostic answers. Diagnos- 
ticity refers to the extent a cue can discriminate 
between two hypotheses (Trope & Bassok, 
1982). In a series of three experiments, Trope & 
Bassok varied the probability of the evidence 
under the hypothesized trait and the diagnostic- 
ity of the evidence. The results provided 
“strong” evidence that people gather informa- 
tion by what Trope & Bassok termed the “diag- 
nostic strategy”. The authors also suggest that 
there was very little evidence for a confirming
strategy. 

The subjects’ tasks in Trope et al. (1984) were
similar to those in the initial Snyder & Swann 
(1978) study with the exception that subjects 
were free to formulate any kind and as many 
questions as they wished. The questions formu- 
lated were categorized by independent judges. 
The findings of the Trope et al. study were: (i) all
the questions asked (experiment 1) were
categorized by the independent judges as poten-
tially diagnostic; (ii) 73% of all questions asked
were bi-directional or open-ended with the re- 
maining 27% being questions about features of 
either the hypothesized or alternative trait. In 
contradiction to the confirmatory strategy, the 
number of questions about features that were 
consistent with the hypothesis (15%) was not
significantly greater than the number of ques- 
tions about features that were inconsistent with 
it (12%); (iii) when provided with no hypo- 
thesis (experiment 2), the subjects did not ask 
significantly more bi-directional or open-ended 
questions (as one would expect under the con- 
firmatory strategy). 

Both of the above two papers conclude that 
individuals do not in general use confirmatory 
strategies but use a “diagnostic” strategy. Under 
this strategy the information gatherer will select 
questions asking about features that are maxi- 
mally diagnostic with respect to the 
hypothesized and alternative traits and that 
these questions should be preferred regardless 
of whether the features are probable or improb-
able under the hypothesized trait. This would
imply, in contrast to the confirmatory strategy,
that the initial framing of the hypothesis would
be irrelevant.

The second question addressed in this paper is 
whether the provision of information on cue 
diagnosticity will reduce the strong tendency 
for auditors to select more failure than viable 
cues as found by Kida (1984). The following null 
hypothesis is tested: 


Hypothesis 2: The comparative number of failure to via-
ble cues chosen by subjects is not affected by the provi-
sion of information on cue diagnosticity.


RESEARCH METHODS 


Variables

This study incorporates three independent 
variables. The first variable (hypothesis framing) 
varied the hypothesis to be tested. Consistent 
with Kida (1984), the two levels that this vari- 
able took were whether subjects were asked to 
determine if the firm is going to fail within two 
years or alternatively remain viable for at least 
two more years. The second variable (prior ex- 
pectations) was designed to set the prior expec- 
tations of subjects as to the likelihood of failure/ 
viability. Subjects were provided with one of 
two sets of ratios, indicating either strong or 
weak financial positions. The third variable (cue 
diagnosticity) was varied such that half the sub-
jects were provided with information on the
diagnosticity of the cues while the other half re- 
ceived no information on diagnosticity. The 
main dependent variable is F-V where F and V 
are the number of failure and viable cues consi-
dered relevant by subjects in making their judg-
ment.²
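

The preference for the difference F-V over the ratio F/V as the dependent variable can be illustrated with a short sketch. The cue counts below are hypothetical and the function names are assumptions made here; the point is only that the ratio is undefined for a subject who lists no viable cues, whereas the difference is defined for every subject.

```python
from typing import Optional


def difference_score(failure_cues: int, viable_cues: int) -> int:
    """F - V: defined for every combination of cue counts."""
    return failure_cues - viable_cues


def ratio_score(failure_cues: int, viable_cues: int) -> Optional[float]:
    """F / V: undefined (None) when a subject lists no viable cues."""
    if viable_cues == 0:
        return None
    return failure_cues / viable_cues


# Hypothetical (F, V) counts: an all-failure, an all-viable and a mixed subject.
subjects = [(6, 0), (0, 6), (4, 2)]

differences = [difference_score(f, v) for f, v in subjects]
ratios = [ratio_score(f, v) for f, v in subjects]

print(differences)
print(ratios)
```

The all-failure subject has no usable ratio and the all-viable subject collapses to zero whatever the number of cues listed, while the difference score keeps both on a common scale.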


Development of research instrument 

As in Kida (1984), the research instrument 
contained a brief description of the company in- 
cluding the fact that a company description had 
been selected at random from a sample of 100 
firms, half of which had failed. In addition, sub- 
jects received information on ratios together 
with 12 other pieces of information. Half the 
subjects also received information on cue diag- 
nosticity. 

To examine the effect of prior information on
information search, two sets of financial ratios
were selected. Each set contained five ratios for
three consecutive years. To ensure that the
ratios were sufficiently informative for subjects
to draw the desired inference, the ratios chosen
were ones that had resulted in accurate predic-
tions in the Libby et al. (1987) study. In that
study, all 57 subjects correctly predicted that the
firm with the “strong” ratios would not fail and
54 that the firm with the “weak” ratios would fail.


²F-V was used rather than F/V because of the effect of the situations where subjects chose all F or all V cues (16 subjects).

In addition, subjects were provided with 12 
other pieces of information concerning the firm. 
Half of these cues (failure cues) described 
characteristics of the firm which pointed to- 
wards the possibility of the firm failing (for 
example, management indicated that new legis- 
lation may make it difficult to market one of the 
firm’s major products), while the remaining 
cues (viable cues) described those firm charac- 
teristics which indicated the likelihood of con- 
tinued operation by the firm in the foreseeable 
future (for example, the technology of the com- 
pany is competitive with other firms in the in- 
dustry). However, no single piece of information 
was conclusive (Kida, 1984). These 12 cues 
were selected from 20 cues originally tested by 
Kida. Four failure cues and four viable cues were 
omitted from Kida’s set for the following rea- 
sons: (a) by providing subjects with a set of 
financial ratios, four of Kida’s cues became re- 
dundant; (b) since the set of “weak” ratios por- 
trayed the company to be in severe financial dif- 
ficulty, it appeared inconsistent to have a cue 
which stated that management believed addi- 
tional equity capital could be raised through the 
issue of share capital; (c) as the study was ad- 
ministered in Singapore, the cues on labour 
strikes and payment of preference share di- 
vidends were inappropriate; (d) the cue which 
suggested the possibility of a key patent being 
obtained in the near future was suggested by 
practicing auditors to have low frequency, and 
was dropped to allow an even number of cues 
which was necessary in providing information 
on cue diagnosticity (see below). It should be 
noted that these 12 remaining cues were all pre- 
tested by Kida. 

In addition to the above information, half of 
the subjects were also provided with informa- 
tion on cue diagnosticity for the 12 items. As 
noted earlier, diagnosticity referred to the ex-
tent to which a cue can discriminate between
two hypotheses. Ability to discriminate was 
measured in terms of the difference in the fre- 
quency of occurrence of a particular firm 
characteristic for a sample of 50 failed and an 
equal sized sample of viable firms. The fre- 
quency of occurrence of each firm characteristic 
was presented via two bar-diagrams. One dia- 
gram showed the frequency of occurrence of the 
firm characteristic for a sample of 50 viable firms 
and the other diagram indicated the frequency 
of occurrence of that firm characteristic for a
sample of 50 failed firms. The greater the differ-
ence in the frequency of occurrence between
the two samples, the greater was the diagnostic 
content of the cue. 

Subjects were informed that the percentages 
indicating the frequency of occurrence of the 
various firm characteristics were obtained from 
an actual sample of 50 viable and 50 failed firms. 
However, as this data was not available for many 
cues, the following procedures were followed:³
(a) four auditors, with audit experience ranging 
from three to ten years, estimated the percent- 
age of viable firms that they expected would pos- 
sess each of the 12 firm characteristics. The aver- 
age percentage estimate of these auditors for 
each firm characteristic was then used as the fre- 
quency of occurrence of that firm characteristic 
for the sample of 50 viable firms;⁴ (b) three via-
ble and three failure cues were randomly
selected to become high diagnostic cues, leaving 
the remaining six cues to become low diagnostic 
cues; (c) for high diagnostic cues the range of 
difference in frequency of occurrence between 
viable and failure firms was kept between 32% 
and 38% ; for low diagnostic cues the range was 
between 2% and 8%; (d) the difference in the 
frequency of occurrence of each of the 12 cues 
was randomly obtained from within the above 
ranges and then added to or subtracted from the 
viable companies cue frequencies to obtain the 
cue frequencies for failed companies. For a fai-
lure (viable) cue the difference was added (sub-
tracted) to ensure that the frequency of occurr-
ence of a failure (viable) cue for the sample of vi-
able firms was always less (greater) than the fre-
quency of occurrence of that failure (viable) cue
for the sample of failed firms.


³The purpose of this manipulation was explained to subjects during the debriefing.
⁴The average correlation between the four auditors’ estimates was r = 0.95.
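

Steps (b) to (d) of the procedure for constructing the cue frequencies can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual materials: the cue names and viable-firm percentages are hypothetical, the function name is invented here, and the differences are drawn from a uniform distribution because the text says only that they were "randomly obtained" within each range.

```python
import random

HIGH_RANGE = (32.0, 38.0)  # percentage-point difference for high diagnostic cues
LOW_RANGE = (2.0, 8.0)     # percentage-point difference for low diagnostic cues

# Hypothetical averaged auditor estimates (step a): percentage of viable
# firms expected to show each characteristic. Six failure and six viable cues.
VIABLE_FREQ = {
    "F1": 20.0, "F2": 25.0, "F3": 30.0, "F4": 15.0, "F5": 35.0, "F6": 40.0,
    "V1": 60.0, "V2": 70.0, "V3": 65.0, "V4": 75.0, "V5": 55.0, "V6": 80.0,
}
CUE_TYPE = {cue: ("failure" if cue.startswith("F") else "viable")
            for cue in VIABLE_FREQ}


def build_failed_frequencies(viable_freq, cue_type, rng):
    """Return (failed-firm frequencies, set of high diagnostic cues)."""
    failure_cues = [c for c in viable_freq if cue_type[c] == "failure"]
    viable_cues = [c for c in viable_freq if cue_type[c] == "viable"]

    # Step (b): three failure and three viable cues become high diagnostic.
    high = set(rng.sample(failure_cues, 3) + rng.sample(viable_cues, 3))

    failed_freq = {}
    for cue, viable_pct in viable_freq.items():
        # Step (c): choose the range that matches the cue's diagnosticity.
        low, high_bound = HIGH_RANGE if cue in high else LOW_RANGE
        diff = rng.uniform(low, high_bound)
        # Step (d): add for failure cues, subtract for viable cues, so that
        # failure cues are more frequent among failed firms and viable cues
        # are less frequent among failed firms.
        if cue_type[cue] == "failure":
            failed_freq[cue] = viable_pct + diff
        else:
            failed_freq[cue] = viable_pct - diff
    return failed_freq, high


rng = random.Random(0)  # fixed seed so the illustration is repeatable
failed_freq, high_cues = build_failed_frequencies(VIABLE_FREQ, CUE_TYPE, rng)
for cue in VIABLE_FREQ:
    level = "high" if cue in high_cues else "low"
    print(f"{cue}: viable {VIABLE_FREQ[cue]:.0f}%, failed {failed_freq[cue]:.0f}% ({level})")
```

The hypothetical viable-firm percentages are chosen so that adding or subtracting up to 38 points keeps every frequency between 0 and 100.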


Task 

The first part of the task required subjects to 
list the three ratios from the company descrip- 
tion that they considered most important. This 
part of the task was not analyzed. The purpose 
was simply to ensure subjects processed the
ratio information with the expectation that half 
of the subjects would form an initial opinion that 
the company was financially sound (“strong 
ratios”) and the other half of the subjects would
form an initial opinion that the company was fac- 
ing financial difficulties (“weak ratios”). The re- 
mainder of the task was based on that used by 
Kida (1984). After reading the firm description, 
subjects were then requested to list the items 
from the firm description that they considered
relevant in deciding whether the firm would fail
(failure group) within two years or remain via-
ble for at least two years (viable group). Subjects
were then asked to rank in order of importance
the five items they considered most important.
Finally, the subjects were asked for a probability
estimate of the company remaining viable for
the next two years (failing in the next two
years).


Subjects 

Eighty Singapore auditors (ten subjects per 
cell) from seven of the U.S.A. “Big 8” firms and
one established Singapore audit firm partici- 
pated in the study. They had an average of 5.8 
years of audit experience. The auditors of six of 
the firms (55 subjects) completed the tasks 
under the supervision of one of the researchers in
their respective firms’ conference rooms. Rep- 
resentatives of the other two firms (25 subjects) 
distributed the questionnaires to the auditors 
with instructions to attempt the task indepen-
dently. Forty-eight percent of these question-
naires were collected within two hours and the
rest in two days. 


RESULTS 


Table 1 presents the mean number of F-V
cues, failure cues and viable cues for all cues
listed and for the five cues considered most rele-
vant. In addition, the mean probability of failure
estimates are given. Table 2 provides ANOVA
results for the 2 × 2 × 2 design with F-V as the
dependent variable.

In the development of hypothesis 1, it was
suggested that because of auditors’ implicit as-
sessments of misclassification costs (Kida,
1984), auditors would in general choose more
failure than viable items. The initial step in the 
analysis tests this suggestion by comparing the 
number of failure and viable items chosen in 
each of the eight treatment groups. Subjects in 
all groups, except group 5, chose significantly or 
marginally significantly more failure than viable 
items (group 1: t = 2.14, p = 0.03; group 2: t =
3.02, p = 0.01; group 3: t = 11.0, p = 0.00;
group 4: t = 8.09, p = 0.00; group 6: t = 1.62,
p = 0.07; group 7: t = 1.53, p = 0.08; group 8:
t = 2.05, p = 0.035). Subjects in group 5, on
average, chose more viable than failure items but 
the differences were not significant (t = 0.81, p 
= 0.22). Thus, consistent with Kida (1984) the 
persistent emphasis on failure items was found. 

Hypothesis 1 suggests that the extent of em- 
phasis on failure cues compared to viable cues is 
affected by the combined effect of hypothesis 
framing and prior information. Table 2 shows 
that the interaction between hypothesis framing 
and prior information is marginally significant (p 
= 0.07). A diagrammatic representation of these 
results is shown in Fig. 1.⁵ Groups 1 and 5, 2 and
6, 3 and 7, 4 and 8 are combined in the diagram 
as there were no interactions involving cue diag- 
nosticity. While Table 2 shows a significant main 
effect for hypothesis framing, Fig. 1 suggests that 
this effect is driven by the hypothesis framing ×
prior information interaction. The calculation of
simple main effects shows that when prior infor-
mation is strong the hypothesis framing effect is
significant (F = 12.05, p = 0.001) but when
prior information is weak, the hypothesis fram-
ing effect is not significant (F = 0.79, p = 0.38).


[Figure: surviving axis labels are F-V; Strong, Weak.]

Fig. 1. A diagrammatic representation of the hypothesis fram-
ing × prior information interaction on F-V.


⁵Results were in a similar direction for the five cues considered most relevant, but the hypothesis framing/prior information
interaction was not significant (p > 0.10).

Newman-Keuls pairwise comparisons were
made to further compare the differences. These
comparisons showed that there were no signifi-
cant differences between the means for subjects
that received weak ratios/viability hypothesis 
(groups 2 and 6), strong ratios/failure hypo- 
thesis (groups 3 and 7), or weak ratios/failure 
hypothesis (groups 4 and 8). That is, there was 
no difference between the treatments that re-
ceived at least one negative signal. However, the
number of failure compared to viable cues 
selected by the strong ratios/viable hypothesis 
groups is significantly less than for each of the 
other three groups (p < 0.05). 

Table 2 also supports the rejection of the null 
for hypothesis 2. The cue diagnosticity treat- 
ment is significant at the 0.001 level both when 
all cues are considered and when the five most 
relevant cues are considered. Providing informa- 
tion on cue diagnosticity increases the relative 
number of viable to failure cues by increasing 
the number of viable cues and reducing the 
number of failure cues selected, compared to 
subjects who did not have this information. 


TABLE 2. Summary of ANOVA results: F-V 

                                     All cues                 Five most relevant cues 
                          df     SS      F       p     ω²      SS      F       p     ω² 
Hypothesis framing (HF)    1    43.5    9.49   0.003  0.09    20.0    3.55   0.064  0.03 
Prior information (PI)     1     9.1    1.99   0.163  0.02     7.2    1.28   0.262  0.01 
Cue diagnosticity (CD)     1   103.5   22.58   0.001  0.20   168.2   29.86   0.001  0.27 
HF × CD                    1     7.8    1.70   0.196  0.02     3.2    0.57   0.453  0.01 
PI × CD                    1     1.5    0.33   0.568  0.00    12.8    2.27   0.136  0.02 
HF × PI                    1    15.3    3.34   0.072  0.03     9.8    1.74   0.191  0.02 
HF × CD × PI               1     0.0    0.00   0.959  0.00     0.2    0.04   0.851  0.00 
Error                     72   330.1                         405.6 
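
The F ratios for the "all cues" analysis in Table 2 can be checked against the reported sums of squares. The following is a minimal sketch, assuming the values as read from the table (one degree of freedom per effect, 72 for error); the names `SS_EFFECTS` and `f_ratio` are illustrative and do not come from the paper. Each F is the effect mean square divided by the error mean square.

```python
# "All cues" sums of squares as read from Table 2 (each effect has df = 1).
SS_EFFECTS = {
    "HF": 43.5,
    "PI": 9.1,
    "CD": 103.5,
    "HF x CD": 7.8,
    "PI x CD": 1.5,
    "HF x PI": 15.3,
    "HF x CD x PI": 0.0,
}
SS_ERROR = 330.1
DF_ERROR = 72


def f_ratio(ss_effect, df_effect=1, ss_error=SS_ERROR, df_error=DF_ERROR):
    """Return the F ratio: effect mean square over error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# Round to two decimals, the precision used in the table.
f_ratios = {name: round(f_ratio(ss), 2) for name, ss in SS_EFFECTS.items()}

for name, value in f_ratios.items():
    print(f"{name:<14} F = {value:.2f}")
```

Because the table reports sums of squares to one decimal place, the recomputed ratios can differ from the printed F values in the second decimal.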
TABLE 3. Summary of means and standard deviations of high and low diagnostic cues 











Hypothesis Prior Mean (S.D.) of cues Mean (S.D.) of five 
framing information selected most relevant cues 
High Low High Low 
diagnostic diagnostic diagnostic diagnostic 
cues cues cues cues 
Viability Financially strong - 3.0 2.3 2.7 
(1.7) (1.9) (1.2) (1.2) 
Viability Financially weak . 3.5 2.8 2.2 
(1.3) (1.8) (0.9) (0.9) 
Failure Financially strong 3.0 3.1 1.9 
(1.4) (1.6) (1.3) (1.3) 
Failure Financially weak 3.3 2.9 2.1 





Further information on the effect of cue diag- 
nosticity is shown in Table 3. It provides a sum- 
mary of the number of high and low diagnostic 
cues selected by the 40 subjects in the cue diag- 
nosticity treatments. Subjects chose more high 
diagnostic cues than low diagnostic cues overall 
(t = 2.74, p = 0.01), but when the five most important 
cues were considered this difference 
was not significant (t = 1.49, p = 0.14). On average 
the number of additional diagnostic cues 
chosen compared to nondiagnostic cues is small 
and does not support the dominating effect of 
the diagnostic strategy suggested by Trope & 
Bassok (1982). Together the information in 
Tables 2 and 3 suggests that the provision of in- 
formation on cue diagnosticity has a significant 
effect on the comparative number of failure and 
viable cues but this is the result of more than just 
the use of a diagnostic strategy. 


Additional analysis 

This section considers two areas of additional 
analysis: the effect of the independent variables 
on auditors’ probability judgments and the relationship 
of the information selected to probability 
judgments. 


(a) Effect of independent variables on proba- 
bility judgments. The last column of Table 1 
shows the mean probability of failure estimates. 





It should be noted that these probability esti- 
mates were made by subjects after they had 
selected their cues and that in selecting these 
cues they had inspected all 12 items. Therefore 
the probability estimates can be influenced by 
the prior information, hypothesis framing, cue 
diagnosticity and the content of all 12 items not 
just those cues selected. 

Similar to the F-V variable, the probability estimate 
is lower for the viability hypothesis/strong 
ratios (groups 1 and 5) than for the other groups. 
However, this prior probability/hypothesis framing 
interaction is not significant (F = 1.46, p = 
0.23) in a 2 × 2 × 2 ANOVA with probability of 
failure as the dependent variable. The only significant 
main effect was for prior information (F 
= 6.86, p = 0.01). Neither cue diagnosticity (F 
= 0.14, p = 0.71) nor hypothesis framing (F = 
1.96, p = 0.17) was significant. 


(b) The relationship of cues selected to prob- 
ability estimates. Consistent with Kida (1984), 
the relationship of the cues selected to the prob- 
ability judgments was examined by correlating 
probabilities with the number of each type of 
cue listed. Correlations of the subjects’ failure 
probability estimates for all groups with the 
number of failure items, viable items and the dif- 
ference between failure and viable items were 
0.38 (p = 0.001) (Kida = 0.53), −0.29 (p = 
0.01) (Kida = −0.15) and 0.44 (p = 0.001) 
(Kida = 0.44) respectively. These results 
suggest that the choice of both viable and failure 
cues is related to the level of probability estimates. 

Subjects were asked for either the probability of the company failing or remaining viable depending upon which group they were in. The latter were converted to probabilities of failure for comparison purposes. 


SUMMARY AND CONCLUSIONS 


This paper extended Kida’s (1984) study on 
auditors’ choice of information cues by (a) considering 
whether confirmatory strategies are 
more evident in auditing contexts in which judgments 
are made sequentially and (b) examining 
the effect of providing information on cue 
diagnosticity on information choice. 

Kida (1984) found that hypothesis framing af- 
fected the relative number of failure and viable 
cues. He also suggested that if preliminary data 
lead an auditor to a particular belief this could 
reduce the effect of hypothesis framing. The 
major finding of the present study is that when 
the prior information indicated failure, hypo- 
thesis framing did not affect the relative number 
of failure and viable cues. However, when the 
prior information indicated nonfailure, hypo- 
thesis framing did have a significant effect. It was 
found that cues chosen, as indicated by the F-V 
variable, were significantly different for those 
subjects given “strong ratios, viable hypothesis” 
from those subjects who received the other three 
combinations of hypothesis framing and prior information 
where at least one failure signal was 
received (that is, weak ratios or failure hypothesis). 
Subjects given “strong ratios, viable 
hypothesis” listed more viable items than those 
in the other groups and, in general, listed fewer 
failure items. 

The results, consistent with Kida (1984), did 
not strongly support the conclusion of Snyder 
and associates as to the robustness and generality 
of the confirmatory bias. Instead it appears 
that in auditing tasks of the type considered in 
this paper and in Kida (1984), the main bias of 
subjects is to select more failure than viable 
items. However, the extent of this bias is affected 
by the initial belief (formed from the hypothesis 
framing and prior information) in a direction 
consistent with the confirmatory bias. Those 
subjects given some indication of failure chose 
more failure than viable cues. Those subjects 
with the strong ratios/viable hypothesis showed 
a reduced tendency to select more failure than 
viable cues but did not select more viable than 
failure cues as would be suggested by the confirmatory 
bias. It is therefore suggested that there 
is only weak support for the confirmatory bias as 
it is dominated by the consistent selection of 
more failure cues. 

The other main finding of the paper was that 
the comparative number of failure to viable cues 
is less for subjects given information on cue diagnosticity 
than for subjects not given this information. 
Those subjects with information on cue 
diagnosticity chose, on average, more viable and 
fewer failure cues than subjects who did not have 
this information. It appears that the information 
on cue diagnosticity decreased the bias towards 
the choice of failure items. However, while the 
provision of information on cue diagnosticity affected 
the cues chosen, the combined effect of 
hypothesis framing and prior information still 
had a significant effect on the cues chosen by 
subjects with information on cue diagnosticity. 
Subjects given no negative information (“strong 
ratios, viable hypothesis”) have lower F-V scores 
than those groups that received at least one 
piece of negative information. This was mainly 
due to these subjects choosing fewer failure cues 
(see Table 1). 

It is also noted that while subjects given infor- 
mation on cue diagnosticity chose significantly 
more high diagnostic than low diagnostic cues, 
the fact that 43% (from Table 3) of the total cues 
selected were low diagnostic cues indicates that 
there are other important factors as well as cue 
diagnosticity that are affecting information 
choice. It is possible that the presentation of cue 
diagnosticity information acted as a decision aid 
by alerting them to the need to examine both vi- 
able and failure cues. However, if they did not 
fully believe or understand the diagnosticity in- 
formation, this could explain why diagnostic 
cues were not chosen more often. Given that our 
presentation of data was similar to Trope et al. 
(1982), who found that subjects choose mainly 
diagnostic cues, it appears that nonbelief of 
some of the diagnosticity information is a more 
likely explanation than not understanding. 

On the basis of the above discussion it is 
suggested that the results of this study fall some- 
where between the two extremes suggested by 
two different groups in the psychology litera- 
ture. First, Snyder and associates suggested 
throughout their studies that the confirmatory 
strategy was extremely robust and their results 
have been interpreted to suggest that “the ex- 
tent to which a feature is diagnostic or differen- 
tially probable under the hypothesis and alterna- 
tive(s) is irrelevant to the confirming strategy 
and should not affect the choice of a feature for 
inclusion in a question” (Trope & Bassok, 1982, 
p. 24). Our results differ from the general find- 
ings of Snyder and associates because of (a) the 
emphasis on failure cues such that under both 
hypotheses subjects selected more failure than 
viable cues and (b) information on cue diagnos- 
ticity was not irrelevant to subjects. Second, 
while cue diagnosticity did have an effect, the ef- 
fect was not as strong as found by Trope and 
associates (Trope & Bassok, 1982; Trope et al., 
1984). Inconsistent with these studies, we found 
that even when cue diagnosticity information 
was given, the combined effect of prior informa- 
tion and framing effects did have an effect on the 
cues selected. Those subjects with information 
on cue diagnosticity who were given strong 
ratios/viable hypothesis chose relatively fewer 
failure as compared to viable cues than those 
subjects who were given some signal of failure 
(either weak ratios, failure hypothesis or both). 

The correlations examined in the additional 
analysis, between the probability of failure estimates 
and the F, V and F-V variables, were in the 
same direction and of similar magnitude to those 
found by Kida (1984), except that the relationship 
between viable cues and probability estimates 
is stronger in our study. However, these 
correlations may substantially underestimate 
the relationship in practice. In this study, subjects 
selected the information they considered 
most important from a list of 12 items and in 
doing this read all 12 pieces of information. 
However, in practice, auditors would only have 
access to the information that they decided to 
collect. Thus, the selection of items for information 
search may be even more closely related to 
probability estimates in practice. 

As with all laboratory experiments, the results 
of this study require qualification. In addition to 
the standard validity threats concerning subject 
selection, subject incentives and amount of in- 
formation provided compared to natural set- 
tings, there are two other limitations that should 
be noted. First, the study is limited in that the 
task, consistent with Kida (1984), was to list the 
items from the firm description that they consi- 
dered relevant in deciding whether the firm 
would fail (remain viable). These items selected 
may not be the same as the items that they would 
actually use or even search for, given differing 
costs of obtaining various pieces of information. 
Second, for subjects given information on cue 
diagnosticity, cues were randomly allocated to 
high and low diagnostic categories. While sub- 
jects were told that the diagnosticity informa- 
tion given was taken from a sample of 100 actual 
firms, subjects’ prior expectations on which 
items had high or low diagnosticity in practice 
may have influenced the weight they placed on 
the cues. 
Abstract 


An experiment was conducted to examine the determinants and consequences of auditors’ perceptions of 
management. Based on the theory of correspondent inference (Jones & Davis, 1965), the following 
hypotheses were formulated: (1) auditors are most likely to make a dispositional inference about 
management when a transaction deviates from expectations and is made under conditions of high choice; 
(2) auditors are more likely to make a dispositional inference about the management of a new client than 
that of a continuing client; and (3) dispositional inferences about management will affect auditors’ 
subsequent judgments. The results support the first hypothesis. Contrary to the second hypothesis, 
subjects’ dispositional inferences about management did not differ significantly between the continuing 
client and the new client conditions. Finally, the materiality threshold of auditors was found to be 
significantly associated with their inferences about management. 


An implicit and yet integral part of the independent 
auditor’s task involves the assessment of 
management’s dispositions (i.e., characteristics, 
attitudes, traits, etc.). This assessment is important 
since it presumably influences subsequent 
audit judgments. For example, an auditor’s perception 
of management’s attitude toward internal 
control is likely to influence the nature, timing 
and extent of audit procedures. A study by 
Kaplan & Reckers (1984) found that the perceived 
control consciousness of an organization 
significantly affected auditors’ preliminary 
evaluations of internal accounting control effectiveness. 
Similarly, the scope of the independent 
auditor’s examination would be affected by circumstances 
that raise questions concerning the 
integrity of management (AICPA 1984, AU 
327.06). 

The assessment of management’s dispositions 
is particularly important in light of the independent 
auditor’s responsibility for detecting fraud. 
According to the professional standards, auditors 
should recognize that management can 
perpetrate irregularities by overriding controls, 
and should consequently be aware of the importance 
of its integrity (AICPA 1984, AU 327.09). 

Lea (1981), however, notes that while the 
public accounting profession’s present standards 
reflect most of the recommendations of the 
Commission on Auditors’ Responsibilities (CAR 
1978), several issues such as the effectiveness of 
auditing procedures in detecting fraud remain to 
be addressed (p. 56). Connor (1986) similarly 
points to the inadequacy of the current auditing 
standards with respect to fraud detection and 
emphasizes the need for prompt, decisive and effective 
action in order to enhance public confidence 
in the accounting profession. Recently, 
this important issue has been addressed more 
extensively, and the result has been several recommendations 
which are listed in the Report of 
the National Commission on Fraudulent 
Financial Reporting (1987). Of particular relevance 
to public accountants is the need to identify 
steps to improve the auditor’s ability to detect 
fraudulent financial reporting (see chapter 
3). Albrecht & Romney (1980) have suggested 
the examination of managers’ and executives’ 
characteristics as one means for enhancing the 
likelihood of detecting fraud. An investigation of 
auditors’ perception of management would, 
therefore, be a first step toward improving their 
ability to detect fraud. 

*The authors wish to acknowledge the assistance of Peat, Marwick, Mitchell & Co. and the helpful comments of Steve Kaplan, Kurt Pany and Phil Reckers on earlier drafts of this paper. 

The major purpose of this study was to 
examine auditors’ perception of management’s 
dispositions using the theory of correspondent 
inferences (Jones & Davis, 1965) as a framework. 
In particular, the following two issues 
were addressed: 

(1) What factors influence auditors’ perceptions of management? 
Specifically, what factors enable auditors to 
form impressions of management? 

(2) How do auditors’ perceptions of management affect 
audit decisions and judgments? This is important since, as 
noted earlier, auditors’ impression of management pre- 
sumably affects their procedures and judgments. 


These questions were examined in the context 
of an issue which involved the appropriate dis- 
closure of an accounting gain. The specific audit 
judgments studied were the perceived impor- 
tance of disclosing the gain according to gener- 
ally accepted accounting principles (GAAP), 
and the materiality threshold that was relevant 
for the given disclosure issue. 

The remainder of this paper is organized as follows. 
First, the theoretical framework and 
hypotheses are developed. The research method 
is then presented followed by the results of the 
investigation. Finally, implications for future research 
are discussed along with the limitations 
of the study. 


THEORETICAL FRAMEWORK AND 
HYPOTHESES 


The theory of correspondent inferences 

The theory of correspondent inferences 
(Jones & Davis, 1965) is concerned with individuals’ 
attempts to make inferences about the 
dispositions of others by observing their be- 
havior. Correspondence, the main concept of 
the theory, refers to the clarity or directness of 
the relation between the inferred disposition 
and the observed behaviour. Correspondence is 
said to be high if an observer perceives that a 
given action can be due to only one disposition 
(West & Wicklund, 1980, p. 117). For example, 
in the present context, correspondence would 
be high if a given action by management (e.g., an 
overstatement of assets) can only be attributed 
to a disposition of management (e.g., lack of in- 
tegrity). Conversely, if the action can be attri- 
buted to several factors (e.g., unintentional er- 
rors, uncontrollable factors, etc.), correspon- 
dence would be low. This study examined inde- 
pendent auditors’ correspondent inferences 
about management by observing its behaviour. 


Deviation and choice 

In order to infer dispositions from actions, an 
observer will consider the deviation of the 
actions from his expectations as well as the 
actor’s degree of choice in performing the 
actions. In general, behaviors that deviate from 
expectations are more likely to lead the per- 
ceiver to make an inference about the disposi- 
tion of the actor. Conversely, actions which do 
not deviate from one’s expectations generally 
tend to be uninformative and do not allow the 
observer to make a dispositional inference about 
the actor. 

Jones & McGillis (1976) have distinguished 
between category-based expectations and 
target-based expectations (pp. 393-394). Category-based 
expectations derive from the perceiver’s 
knowledge that the target person is a 
member of a particular class, category or reference 
group. If an actor is a member of a given 
category, one would expect to observe behavior 
that is consistent with the modal behavior of the 
category. For example, in the present context, 
the independent auditor’s category-based ex- 
pectations about management are defined by the 
professional auditing standards in terms of its 
(management’s) responsibilities [e.g., AICPA 
(1984), AU 110.02]. 

Target-based expectations derive from prior 
information about the specific individual actor. 
These expectations can be inferred from previ- 
ous observations of the consistency of the 
actor’s behavior (for example, auditors’ observa- 
tion of management’s behavior in similar situa- 
tions in the past). Jones & McGillis (1976) note, 
however, that there appear to be no systematic 
differences between the effect of deviating from 
category-based versus target-based expectations 
(p. 398). As in the case of category-based expectations, 
behavior which deviates from target-based 
expectations will tend to be more informative 
about the actor’s (e.g., management’s) 
disposition than behavior which is consistent 
with expectations. 

A second factor which determines the likeli- 
hood that an observer will make a dispositional 
inference about an actor is the perceived degree 
of choice in performing the behavior. The more 
behavioral freedom an actor is perceived to have 
in engaging in an action, the more confident the 
observer will be that the action reflects an un- 
derlying disposition. On the other hand, if the 
behavior is perceived to have been performed in 
the presence of environmental pressures, it will 
be unclear whether the behavior was caused by 
situational or dispositional factors (West & 
Wicklund, 1980, p. 119). 

In summary, according to the theory of correspondent 
inferences, an observer is most 
likely to infer that a given action reflects an underlying 
disposition of the actor when the latter 
is perceived to have acted freely and when the 
behavior is inconsistent with the perceiver’s 
prior expectations concerning the particular 
actor (Jones & Davis, 1965, p. 229). These predictions 
have been tested and supported in social 
psychology [e.g., Jones & Harris (1967), and 
Jones et al. (1971)]. Therefore, our first hypothesis 
is: 


H1: Auditors are most likely to make a dispositional inference 
about management when a transaction is perceived 
to deviate from expectations and made under conditions 
of high choice (i.e., there will be an interaction effect between 
choice and deviation from expectations on dispositional 
inference). 


Knowledge about the target 

In examining the impact of perceived choice 
and deviation on the observer’s likelihood of 
making a dispositional inference, one must con- 
sider the extent of the observer’s knowledge 
about the actor. Jones & McGillis (1976) note 
that, 


If a perceiver has firm prior knowledge about the target 
person and the latter behaves in a highly unexpected 
way, the perceiver may attribute his behavior to the situ- 
ation rather than change his conception of the person. If 
expectations about the situation are firmer than those 
about the actor, there should be a change in person 
attribution (p. 400). 


Such a boundary condition appears to be relevant 
for investigating the auditor-client association 
issue. Specifically, differences in audit judgments 
between a new client and an established 
client may be attributed to the auditor’s differential 
knowledge about the two clients, assuming 
that causal inferences influence subsequent 
judgments. The independent auditor presumably 
has firm knowledge about the management 
of an established client whereas he has relatively 
little knowledge about the management of a new 
client. Therefore, when deviations from category-based 
expectations are observed, the auditor 
would be more likely to attribute them to 
management when the client is new than when 
it is a continuing one. Accordingly, our second 
hypothesis is: 


H2: Auditors are more likely to make a dispositional infer- 
ence about the management of a new client than that of 
a continuing one (i.e., there will be a client main effect on 
dispositional inference). 
Consequences of dispositional inferences 

The hypotheses developed thus far have dealt 
with only the determinants of auditors’ percep- 
tion of management’s dispositions. This section 
discusses the consequences of making disposi- 
tional inferences about management. 

Research in attribution theory has, in general, 
provided relatively weak support for the link be- 
tween attributions and consequences [see for 
example, Mitchell (1982), and Kelley & Michela 
(1980)]. In the auditing context, however, it is 
recognized that the independent auditor must 
consider the dispositions of management during 
the course of the examination. For example, 


the auditor should be aware of the importance of management’s 
integrity to the effective operation of internal 
control procedures and should consider whether there 
are circumstances that might predispose management to 
misstate financial statements (AICPA, 1984, AU 327.09). 


Thus, a causal inference made by the independent 
auditor should influence subsequent audit 
judgments to the extent that it is dispositional. 
Given the specific audit judgments examined, 
the third and fourth hypotheses are stated as fol- 
lows: 


H3: Auditors will perceive the disclosure of a given item 
(according to GAAP) to be more important when their 
inferences about management are dispositional than 
when they are not (i.e., dispositional inferences and perceived 
importance of disclosure will be positively correlated). 

H4: Auditors will use a lower materiality threshold con- 
cerning a given disclosure issue when their inferences 
about management are dispositional than when they are 
not (i.e., dispositional inferences and materiality 
threshold will be negatively correlated). 
RESEARCH METHOD 


Design 

The three factors (deviation from expecta- 
tions, choice and knowledge about the target) 
were each manipulated at two levels: low devia- 
tion vs high deviation; choice vs no (or low) 
choice; and high vs low knowledge about the 
target (i.e., continuing vs new client). These factors 
were manipulated in a completely crossed 
2 × 2 × 2 factorial design. Figure 1 shows an overview 
of the design. 
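
The eight cells of a completely crossed design of this kind can be enumerated mechanically. The sketch below is illustrative only: the factor names and level labels are taken from the description above, while the dictionary layout and the name `cells` are assumptions made for the example.

```python
from itertools import product

# Three factors, each manipulated at two levels, as described in the design.
FACTORS = {
    "deviation": ("low", "high"),
    "choice": ("choice", "no choice"),
    "client": ("continuing", "new"),
}

# A completely crossed design contains every combination of factor levels.
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]

for number, cell in enumerate(cells, start=1):
    print(number, cell)
```

Each subject would receive a case corresponding to exactly one of these eight cells.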


Subjects 

A total of 117 auditors from several offices of 
a large international public accounting firm par- 
ticipated in the experiment at the end of their 
advanced in-charge audit training seminar. One 
of the topics covered in the seminar was the 
firm’s policy concerning materiality judgments 
on audits. Subjects indicated in a posttest questionnaire 
that they were familiar with the experimental 
task. They averaged over three years of 
audit experience. Seven questionnaires were 
discarded due to incomplete answers, leaving 
110 usable responses. 


Procedure 

Subjects each received a case reflecting one of 
the eight combinations of the three independent 
variables (each controlled at two levels). The 
cases were randomly assigned to the subjects 
who were instructed to assume the role of the in- 
charge auditor for a hypothetical manufacturer 
of household consumer products. The audit 
client was described as an SEC firm with an apparently 
deteriorating financial position. The case 
further indicated that during the current year, 
management engaged in a transaction (sale of 
long-term assets) which resulted in a gain. 

Fig. 1. Overview of research design: client (continuing vs new) × choice (choice vs no choice) × deviation (low vs high). 

After reading the case material, subjects were 
asked to make an inference about the cause (dis- 
positional vs situational) of the transaction, and 
to make two decisions relating to the disclosure 
of the gain. Specifically, they were asked: (1) to 
judge the importance of disclosing the gain ac- 
cording to GAAP; and (2) to indicate the mate- 
riality threshold that they would use for asses- 
sing the importance of disclosure according to 
GAAP. 


Independent variables 

Three independent variables were studied in 
this experiment: deviation; choice; and knowl- 
edge about the target. Each was manipulated at 
two levels (high vs low). 

Both the deviation and choice factors were 
manipulated in the description of the transaction. 
The deviation factor was manipulated in 
three ways by allowing subjects to compare the 
given transaction to both category-based and 
target-based expectations. Category-based expectations 
were created by indicating: (1) the 
behavior of management of “other firms in this 
(same) industry” in similar situations; and (2) 
that the behavior was “justifiable and in the best 
interest of the companies”. With respect to 
target-based expectations, the cases provided information 
about management’s behavior “In 
similar cases in the past”. The transaction described 
subsequently either conformed to or deviated 
from the expectations. 

The high choice cases specifically stated that 
management “voluntarily chose” to engage in 
the given transaction, whereas the no (or low) 
choice cases indicated that the transaction re- 
sulted from external factors (i.e., government 
authority). Finally, knowledge about the target 
was controlled by informing subjects that the 
audit client was continuing or new. 

All cases were pretested in a pilot study using 
15 audit seniors of a large international public 
accounting firm other than the one used for the 
actual experiment. Although the sample was too 
small to perform meaningful statistical analyses, 
the feedback obtained enhanced the readability 
of the case material and enabled the effective 
manipulation of the deviation and choice vari- 
ables. 


Dependent variables 

The first dependent variable measured the ex- 
tent to which the cause inferred by the subjects 
was dispositional. To this end, subjects were 
asked to indicate on two separate 7-point scales 
the degree to which the observed event (i.e., 
transaction) was due to (1) situational factors, 
and (2) management's dispositions. The scales 
ranged from “not at all” (1) to “to a great extent” 
(7). An attribution index (net dispositional in- 
ference) was computed by subtracting the rat- 
ings on the situation scale from the ratings of the 
dispositional scale [see Storms (1973)]. Thus, 
higher net scores indicate more dispositional inferences. 
This index was used to test the first two 
hypotheses. 
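
The attribution index described above is a simple difference of two ratings. As a minimal sketch (the function name and the range check are assumptions added for illustration, not part of the original instrument), it can be written as:

```python
def net_dispositional_inference(dispositional, situational):
    """Return the attribution index: dispositional rating minus situational rating.

    Both ratings are on the 7-point scales described in the text,
    from 1 ("not at all") to 7 ("to a great extent").
    """
    for rating in (dispositional, situational):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must lie between 1 and 7")
    return dispositional - situational


# Hypothetical ratings: a mostly dispositional and a mostly situational attribution.
print(net_dispositional_inference(6, 2))
print(net_dispositional_inference(2, 5))
```

Under this definition the index ranges from -6 (purely situational) to +6 (purely dispositional), with positive scores indicating a net dispositional inference.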

Two approaches were used to measure materiality 
judgments. Under the first approach, the 
dollar amount of the given item (gain) was provided 
to the subjects. Further, they were informed 
that the given amount of the gain was approximately: 
(1) 7.5% of income before taxes; 
(2) 1.0% of total revenues; and (3) 1.5% of total 
assets. These percentages are within the range of 
commonly used quantitative guidelines for determining 
materiality [see for example, Gafford 
& Carmichael (1984)]. Subjects were subsequently 
asked to indicate the importance of 
disclosing the given gain according to GAAP. A 
7-point scale was used to that end. 

The second approach consisted of asking subjects 
to indicate the amount of the gain that they 
considered material for the purpose of deciding 
whether disclosure should be made according 
to GAAP. Rather than requesting the specific 
dollar value, subjects were asked to express the 
materiality threshold as a percentage of income 
before taxes. Given the widely used rule of 
thumb (i.e., 5-10% of income before taxes) for 
determining materiality [see for example, Leslie 
(1984), chapter 4; Gafford & Carmichael 
(1984)], subjects were explicitly requested to 
indicate a specific percentage rather than a 
range. 


RESULTS 


This section presents the results of the tests of 
the main hypotheses of this study. Hypotheses 1 
and 2 were tested using ANOVA (see Table 1). 

The dependent variable was the net dispositional
attribution score, which was computed as
the difference between the dispositional and
situational inference scores. Hypotheses 3 and 4
were tested using correlational analysis (see
Table 4).


Manipulation checks 

In order to verify the effectiveness of the
manipulation of the choice and deviation factors,
subjects were asked four questions after making 
the audit judgments. First, they were asked to in- 
dicate the extent to which they felt that manage- 
ment had a choice in engaging in the given trans- 
action. The responses were scored on a 7-point 
scale labeled from “1” (not at all) to “7” (to a
great extent). A one-way ANOVA indicated that 
the choice manipulation was successful 
[F(1,108) = 7.66; p = 0.006]. 

Similarly, the effectiveness of all three man- 
ipulations of deviation was checked by asking 
subjects to indicate on separate 7-point scales 
the extent to which management’s behavior 
was: (1) consistent with that of the management
of other firms in the same industry; (2) in the 
best interest of the company’s stockholders; and 
(3) consistent with its behavior in the past. All 
three manipulations of deviation were success- 
ful with a p-value <0.0001. 


Hypotheses 1 and 2 

The first hypothesis predicted an interaction 
effect between the choice and deviation factors. 
The results presented in Table 1 support this 
hypothesis. Subjects in the high choice/high de- 
viation condition made the most dispositional 
inference about management. The cell means 
are given in Table 2. Higher values indicate a
relatively more dispositional (or relatively less
situational) inference.
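
As a rough check on Table 1, each F ratio can be recomputed as the effect mean square divided by the error mean square. The sketch below uses the sums of squares reported in Table 1. The error degrees of freedom (102) are an assumption, inferred from the 110 subjects shown in Table 2 less the eight cells of the 2 x 2 x 2 design, because the printed value is not legible in this copy.

```python
def f_ratio(ss_effect: float, df_effect: int, ss_error: float, df_error: int) -> float:
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)


SS_ERROR = 682.66  # error sum of squares reported in Table 1
DF_ERROR = 102     # ASSUMPTION: 110 subjects - 8 cells; not legible in the source

# Sums of squares for selected rows of Table 1 (each effect has 1 degree of freedom).
TABLE_1_SS = {
    "Client (CL)": 11.74,
    "Choice (CH)": 39.19,
    "Deviation (D)": 0.05,
    "CH x D": 38.76,
}

for source, ss in TABLE_1_SS.items():
    print(f"{source:14s} F = {f_ratio(ss, 1, SS_ERROR, DF_ERROR):.2f}")
```

Under that assumption the recomputed ratios agree with the reported F values to within rounding, which suggests the assumed error degrees of freedom are at least plausible.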

Although the choice main effect is also signifi- 
cant and in the expected direction (no choice = 
—0.78, choice = 0.42), it must be interpreted in 
light of the interaction effect. Table 3
summarizes the results of the tests of simple main
effects. There was no significant difference
between the high and low deviation levels when
management had no choice over the transaction.
However, in the choice condition, the mean
dispositional inference score was significantly
higher for the high deviation treatment than for


TABLE 1. ANOVA on net dispositional score

Source            Sum of squares    DF     F      P-value
Client (CL)            11.74          1    1.76    0.188
Choice (CH)            39.19          1    5.86    0.017
Deviation (D)           0.05          1    0.01    0.929
CL x CH                 2.80          1    0.42    0.519
CL x D                  1.53          1    0.23    0.632
CH x D                 38.76          1    5.79    0.017
CL x CH x D             0.33          1    0.05    0.823
Error                 682.66        102


TABLE 2. Cell means for choice X deviation interaction on
net dispositional score

                    Choice    No choice
Low deviation       -0.18       -0.27
                    n=29        n=29
High deviation       1.03       -1.34
                    n=27        n=25

TABLE 3. Test of simple main effects

Source                      Sum of squares    DF      F       P-value
Deviation at no choice          14.64         1,52    2.31     0.132
Deviation at choice             21.72         1,54    3.16     0.080
Choice at low deviation          0.10         1,56    0.02     0.898
Choice at high deviation        75.69         1,50   11.53     0.001

the low deviation treatment. Moreover, there
was no significant difference in mean attribution
score between the choice and no choice conditions
when the deviation was low. In contrast,
when deviation was high, the mean dispositional
inference score was significantly higher for the
choice condition than for the no choice condition.

The second hypothesis predicted a client
main effect on the net dispositional score. Table
1 indicates that the client main effect was not
statistically significant. An examination of the
cell means, however, reveals that, as predicted,
the dispositional score for the new client condition
(0.15) was higher than that for the continuing
one (-0.44). The lack of a significant difference
is inconsistent with the results of a study by
Bates et al. (1982). The “client” variable was
controlled differently in that investigation than
it was in the present study. Therefore, a possible
explanation for the observed inconsistency is
that the lack of a significant difference may have
resulted from an ineffective manipulation of the
“client” factor in the present study.


Hypotheses 3 and 4

The third and fourth hypotheses examined
the relationship between dispositional inference 
and subsequent judgments. Correlational analy- 
sis has been used to test the link between causal 
attribution and subsequent behavior in both ac- 
counting and nonaccounting contexts [e.g., Kap- 
lan & Reckers (1985) and Mitchell & Wood 
(1980)]. 

Table 4 shows the results of the correlational 
analysis between the net dispositional score and 
materiality threshold, and the net dispositional
score and importance of disclosure. Using the
Kendall Tau B approach which adjusts for ties,
net dispositional score was found to be significantly
and negatively correlated with materiality
threshold, but not with importance of disclo- 
sure. Pearson’s statistic was determined to be in- 
appropriate since the distribution of both mate- 
riality and importance of disclosure did not 
satisfy the normality assumption. Table 5 shows 
the means and standard deviations for the four 
choice/deviation conditions. 
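
To make the tie adjustment concrete, the following is a minimal sketch of the Kendall Tau B coefficient computed directly from pairwise comparisons. It illustrates the general statistic and is not the authors' procedure; the function name and the sample data are hypothetical and are not drawn from the study.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall Tau B: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)).

    C and D count concordant and discordant pairs, Tx counts pairs tied
    only on x, and Ty counts pairs tied only on y. Pairs tied on both
    variables enter neither the numerator nor the denominator.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denominator = sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical data: net dispositional scores and materiality thresholds (%).
net_scores = [-3, -1, 0, 2, 2, 4]
thresholds = [10, 8, 8, 5, 6, 5]
print(round(kendall_tau_b(net_scores, thresholds), 3))
```

Because ratings on short scales produce many tied pairs, a coefficient that removes ties from the denominator is a natural choice for data of this kind.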

TABLE 4. Correlations between net dispositional score and audit judgments

                               Importance    Materiality
Kendall Tau B correlation
  Coefficient                    -0.005        -0.169
  p-value                         0.939         0.028


TABLE 5. Cell means and standard deviations for choice X deviation interaction on
importance and materiality scores

                               Importance         Materiality
                               Mean    S.D.       Mean    S.D.
Low deviation/choice           5.12    1.59       7.18    2.72
Low deviation/no choice        5.27    1.54       7.83    4.20
High deviation/choice          5.51    1.61       5.24    2.16
High deviation/no choice       5.42    1.51       7.10    3.70


The results do not support the third hypothesis.
Dispositional inferences about management
were not associated with higher perceived
importance of disclosing a given item according 
to GAAP. One plausible explanation is that sub- 
jects may have based their decision on other var- 
iables which were inadequately controlled in 
the case description. For example, the state- 
ments in all treatment levels reflected a client 
with an apparent deteriorating financial posi- 
tion. Thus, it is possible that subjects across all 
conditions viewed the company as a relatively 
high risk client. Consequently, they all consi- 
dered that it was relatively important to disclose 
the gain according to GAAP regardless of their 
perception of management. Similarly, the lack of 
difference in perceived importance of disclo- 
sure may have resulted from the fact that the 
amount of the gain (7.5% of income before 
taxes) approached or exceeded the materiality 
threshold of the auditors. These interpretations 
appear to be consistent with the results pre- 
sented in Table 5. 

Alternatively, other factors (e.g., perceived
audit risk, materiality threshold, etc.) may have 
mediated the effect of auditors’ perception of 
management on the importance of disclosure. In 
other words, the effect of auditors’ inferences 
about management on the importance of disclo- 
sure is not a direct one. This possibility is investi- 
gated further below. 

The fourth hypothesis predicted that disposi- 
tional inferences about management would af- 
fect subjects’ materiality threshold judgment. 
While the correlation coefficient was modest
(-0.169; see Table 4), lower materiality
thresholds were found to be significantly
(p = 0.028) associated with more dispositional
inferences about management. As Table 5 shows, the
mean materiality threshold for the choice/undesirable
(high deviation) condition was lowest. Moreover, the
variation among subjects in that condition (S.D. =
2.16) was also less than in any of the other
conditions. Therefore, it appears that subjects who
inferred a disposition of management made 
more homogeneous materiality judgments than 
those who did not. 

An analysis was performed to examine 
whether subjects’ materiality thresholds were 
associated with their perceived importance of 
disclosing the gain item according to GAAP (i.e.,
the first audit judgment). This relationship was 
confirmed by the computed Kendall Tau B coefficient
(-0.45), which was significant at the 0.0001 level.
Thus, as one would expect, lower materiality 
thresholds were associated with a higher per- 
ceived importance of disclosure. 

The results of the correlational analysis among 
disposition, importance of disclosure and mate- 
riality threshold can be summarized in a simple 
model as shown in Fig. 2. The computed correla- 
tion coefficients show that there is no direct ef- 
fect of auditors’ inferred disposition of manage- 
ment on the disclosure judgment. Rather, sub- 
jects’ materiality threshold appears to have been 
a mediating factor. Clearly, other unobserved 
variables affected the perceived importance of 
disclosure. This may explain the absence of a 
more extreme (higher) importance score in the
high choice/high deviation condition despite 
the significantly lower materiality threshold 
(see Table 5). 


Fig. 2. Correlational model of inferred disposition, materiality threshold and importance of disclosure.
  Inferred disposition -> Materiality threshold: -0.169 (P<0.028)
  Materiality threshold -> Importance of disclosure: -0.45 (P<0.0001)
  Inferred disposition -> Importance of disclosure: -0.005 (P<0.939)


DISCUSSION

The results support the first hypothesis that
auditors’ inferences about the cause of a transaction
would be most dispositional when the transaction
is perceived to deviate from expectations
and to have been made by management under
conditions of choice. This is consistent with studies
by Jones & Harris (1967) and Jones et al. (1971).

Although the present study did not control for 
any specific disposition of management, it pro- 
vides evidence that auditors are sensitive to the 
choice and deviation factors. Auditors in the pre- 
sent study were not unlike non-auditors in their 
ability to infer the locus (dispositional vs situa- 
tional) of the cause of a given behavior. If this 
finding can be generalized across auditors and 
audit contexts, the implications for practice may 
be significant. In particular, public accounting 
firms may formally recognize the importance of 
the choice and deviation factors in their audit 
programs as a means to form an impression of 
the management of their client. Moreover, de- 
pending on the nature of the audit judgment, 
these variables can be incorporated in expert 
systems of public accounting firms, developed 
and used for enhancing consistency and consensus.

Contrary to the second hypothesis, there was
no significant difference in causal inferences be- 
tween the continuing and new clients. A likely 
explanation is that the finding may have resulted 
from an ineffective manipulation of the “client” 
factor. Future studies on the auditor—client as- 
sociation issue should first ensure effective con- 
trol of this variable. 

The perceived importance of disclosure was 
not directly affected by auditors’ inferences 
about management. However, this judgment 
was significantly associated with the materiality 
threshold of the subjects. An analysis of the sim- 
ple path model indicates that auditors’ percep- 
tion of management influenced the importance 
of disclosure through its effect on the materiality 
threshold. 

Finally, the results of the present study pro- 
vide some insights into the materiality issues 
raised by Holstrum & Messier (1982). First, sub- 
jects’ materiality threshold expressed as a per- 
centage of income before taxes was influenced 
by their (subjects’) perception of management. 
Auditors’ impression of management is apparently
a qualitative factor which is relevant to
materiality judgments. Second, subjects’ materiality
thresholds were more homogeneous
when they perceived a given transaction to be 
the result of management’s disposition than 
when they attributed it to situational factors. 
These findings are particularly important 
since some public accounting firms, such as the 
one participating in this study, have specific
quantitative guidelines for delimiting material- 
ity. One would generally expect auditors from 
these firms to be less sensitive to qualitative fac- 
tors than auditors from firms which do not have 
specific quantitative guidelines. This study 
found that despite explicit quantitative guidelines
prescribed by their firm, auditors participating
in this experiment attended to qualitative
factors in formulating their materiality
threshold. A tentative implication of this finding
is that currently used materiality guidelines
based solely on quantitative factors may be
inadequate.


Implications for future research 

This study has provided some insights into au- 
ditors’ perception of management. However, 
before any definitive conclusions can be drawn, 
additional research is needed. The following 
presents some implications for future research. 
First, the theoretical model presented in this 
paper can provide a basis for investigating how
auditors infer specific dispositions of manage- 
ment such as integrity, motives and attitudes. 
This is particularly relevant to the issue of impro- 
ving auditors’ ability to detect fraud as discussed 
in chapter 3 of the Report of the National
Commission on Fraudulent Financial Reporting
(1987). 

Future studies could also investigate how au- 
ditors’ impressions of management and infer- 
ences about its specific dispositions influence 
other audit decisions such as the assessment of 
risk, the evaluation of internal control, and the 
nature, timing and extent of audit procedures. In 
addition to providing insights into these judg- 
ments, the results of those examinations would 
enable one to establish the validity of the attribu- 
tional framework in the auditing context. 

Another potential use of the model presented 
in this study is for conducting comparative
studies. For example, the apparent gap that 
exists between the performance of auditors and 
the expectations of users of financial statements 
(CAR, 1978) can be investigated using such a 
framework. Arrington et al. (1983) and
Arrington et al. (1985) examined differences in
attributions of audit responsibility (failure) as a
means to study the issue of expectations gap. 
The present attribution model can similarly be 
used to determine whether the expectations gap 
is a result of divergent causal inferences about 
management between financial statement users 
and auditors. 


Limitations 

A major limitation relates to the case approach 
used in this study. Subjects were provided with 
only a limited amount of information primarily
intended for the manipulation of the independent
variables. In the natural setting, other information
would be available that may influence
auditors’ judgments. Thus, although the case
approach provided an effective means of controlling
the independent variables of interest, there
is the inevitable risk that some other significant 
factors might have been omitted.

A further problem pertains to the study’s li- 
mited external validity. While the hypotheses 
developed in this paper were not meant to be 
context or task specific, in this experiment they 
were tested with respect to a particular audit 
task (disclosure issue), one type of transaction 
(sale of assets), and a specific audit client (an SEC
firm with a deteriorating financial position). 
Moreover, subjects were auditors of a single 
public accounting firm. Accordingly, any 
generalizations must be made with care. 


BEYOND THE OBJECTIVIST AND THE SUBJECTIVIST:
LEARNING TO READ ACCOUNTING AS TEXT*

Abstract


Accounting research, especially that focused on understanding accounting in its organizational context, is 
increasingly recognizing the “subjective” as a realm of interest distinct from the “objective” realm that 
previously had been its predominant concern. This paper argues that although it is a step forward for
accounting research to recognize both the “subjective” and the “objective” as valid concerns, it is a mistake 
to pose a dichotomy between the two or to suggest that there are two different kinds of researchers 
(objectivist and subjectivist) who appropriately focus on one realm of experience or another.

The hermeneutic turn which is taking place in the broader social sciences and which is reflected in the 
appearance of subjectivist accounting research, is properly understood as a rejection of the subjective— 
objective dichotomy. The hermeneutic turn appreciates that our knowledge of accounting and
organizations is not guaranteed by a method that separates the objective from the subjective in order to 
penetrate to the “laws” of the social universe. Instead, our knowledge of accounting and organizations is 
constructed through a social practice in which such distinctions are not meaningful. 

Morgan's recent book, Images of Organization (Morgan, 1986), goes beyond the false dichotomy 
between the subjectivist and the objectivist and presents eight image-based readings of organization. In this 
paper I give a reading of his text that relates it to the hermeneutic turn in the social sciences and explores 
its implications for understanding accounting in its organizational context as well as for doing accounting 
research.


Organization theory is generally characterized 
as a field that has progressed through several 
stages of development. We are all familiar with 
such stages as the “classical” school, the “human 
relations” school, the “contingency” school and 
so on. Each new wave of thought about organiza- 
tions was supposedly getting us closer and 
closer to an accurate description of what organi- 
zations really are and how we should best think 
about them. Each new school was seen as a 
broadening and enriching extension of theory, 
leading toward a more global understanding of 
organizations. 


With the publication of Sociological
Paradigms and Organizational Analysis in
1979, Burrell & Morgan gave us cause to question
the breadth and diversity of intellectual
concerns in our discourse on organization
theory. They synthesized the diverse assumptions
that have guided theories on the nature of
social science and on the nature of society and
they constructed a grid defined by a subjective–objective
axis (representing social science
assumptions) and a regulation–radical change
axis (representing societal assumptions). They
then located major sociological paradigms and
their associated modes of organizational analysis
on the grid, arguing that the various “stages of
development” of organization theory we have
been reciting are all contained quite tightly in
one functionalist quadrant reflecting objectivist
and status quo values.

Burrell & Morgan’s (1979) meta analysis of
the sociological theories that have guided the 
general field of organizational studies also 
helped to reveal the functionalist assumptions 


*The author has benefited from the helpful comments of Ted O'Leary and Peter Miller on an earlier version of this paper. 
that had explicitly or implicitly guided organizational
research in accounting (Cooper, 1983). In
dramatizing an objectivist–subjectivist distinction
as the social science axis of their paradigm
grid, they helped to free our discourse from the 
objectivist corner in which it had been trapped. 
Their book was an important element in a shift- 
ing background of assumptions about social sci- 
ence that helped to set the stage for more 
interpretive research in accounting (Boland, 
1979; Colville, 1981; Tompkins & Groves, 1983; 
Boland & Pondy, 1983, 1986; Covaleski & 
Dirsmith, 1986, 1988; Chua, 1986; Ansari & 
Euske, 1987; Hopwood, 1987; Nahapiet, 1988). 
The result has been an increase in the number of 
“roles” of accounting that are revealed as diffe- 
rent perspectives are taken. Hopper & Powell 
(1985), Ansari & Euske (1987) and Hopper et 
al (1987), review some of these shifting sets of 
accounting roles being studied. 


Although Burrell & Morgan (1979) have contributed
to breaking the hold of a crude objectivism
on accounting research, they have also
laid a trap for those who would take up the new 
subjectivist banner. This is the trap of the object- 
ive—subjective continuum itself. Their grid has 
not only opened up a new space, it has also 
become reified as a kind of fundamental distinc- 
tion that gives a new boundary to our discourse. 
The objective—subjective continuum is a line on 
which we take a position and from which we 
speak. We are either one kind of researcher
(objectivist) or another (subjectivist). What was
a useful literary device to break an old mind set
has become the basis of a new one that holds us
as tightly as its predecessor. 

Accepting the objectivist–subjectivist
dichotomy as a basis for describing themselves
makes uncomfortable and unnecessary demands
on the “subjectivists”. The difficulties for the
emerging subjectivists stem from the implica- 
tion that they are somehow concerned with a 
new and wholly different realm of experience 
than that of traditional accounting research. The 
point to be made is that neither the subjective 
nor the objective can stand alone as an area of 
study and that we need to appreciate the nature 
of their genuine union in the experience of both 
those who use accounting and those who
research it (Boland & Pondy, 1983, 1986). 

As in a figure ground relation, each requires 
the other for context in order to be completed 
and to stand out as apart and separate. The ob- 
jective fact is socially constructed and the sym- 
bolic meaning is empirically grounded. The 
photograph as figure is in a sense an “objective 
picture”, but it is grounded in the intentions of
the picture taker and it captures a world of ob- 
jects with socially defined meanings. Similarly, 
the subjective power of the Madonna figure as a 
cultural symbol of fertility and nurturing is 
grounded in the physical biology of reproduc- 
tion. 


THE HERMENEUTIC TURN 


Accountants tiring of the traditionalist 
research agenda and interested in exploring this 
emerging subjectivist alternative should be 
aware that the intellectual currents of today are 
breaking the constraints of the subjective— 
objective dichotomy. Some important lines of 
work on this theme are ably presented by Bern- 
stein (1976, 1983) and by Neimark & Tinker 
(1987). Many of these attempts to go beyond the 
subjective—objectivist dichotomy are part of a 
hermeneutic turn in the social sciences. Taking a 
hermeneutic turn in social science involves a 
special appreciation of the close intertwining of 
human action and human language embedded in 
a field of social practice (Taylor, 1971). Her- 
meneutics is the study of interpretation -and- 
originally addressed the problem of interpreting 
ancient religious texts. Such texts are truly alien, 
having been written in different languages by 


unknown authors in unfamiliar cultural con- 


texts. The hermeneutic problem is to gain mean- 
ing from such an alien text by engaging in an in- 
terpretative dialogue with it. 

Taking a hermeneutic turn in the social
sciences means approaching the social world as a
text that is alien and unfamiliar: a text with
significance and meaning that will emerge only
through interpretation. The social scientist is a
reader of that text or, more precisely, a reader of
the way social actors read that text to themselves
(Geertz, 1972). Thus, theory does not
stand apart from action as the objective, imper- 
sonal essence of a subjective and personal 
performance. Rather, theory and action are inex- 
tricably bound and emerge from a common field 
of language practice. 

A central figure in this emerging tradition is 
Wittgenstein. He above all others destroyed the 
hope for an ideal, logico-analytic language to 
secure our knowledge of the world. In its place 
he left us an understanding of language as a mul- 
titude of different games we engage in as we in- 
teract with others in our everyday activities. To 
emphasize the intimately reciprocal relationship
between language as our medium for gaining
knowledge and action as our concrete practices
in the everyday world, Wittgenstein used the
phrase “language game”. This is to denote the
importance of always considering both the language
we use and the actions we engage in while
using it.


7. ...
   I shall also call the whole, consisting of language and
   the actions into which it is woven the “language
   game”.
23. ...
   Here the “language game” is meant to bring into
   prominence the fact that the speaking of language is
   part of an activity, or a form of life (Wittgenstein,
   1954).

From Wittgenstein we learn that there is no 
essential, enduring, abstract knowledge of the 
world to be found in language apart from our 
participation in a form of life and our giving of 
meaning to words and propositions in the 
grounded action of our day to day practice. 


241. “So you are saying that human agreement decides
   what is true and what is false?” It is what human
   beings say that is true and false; and they agree in the
   language they use. That is not agreement in opinions
   but in forms of life (Wittgenstein, 1954).


Within a “language game” or “form of life”, our 
theories seem to be grounded on the eternal 
essence of things. The accounting objectivists in 
our traditional language game can speak firmly 
about a solid, enduring kind of truth. They hold
their theories to be so true because they seem to 
reach down to the ultimate foundations or 
essence of social reality (Rorty, 1979, p. 361). 
But our theories and the language games within 
which we make them are grounded only in our 
social practice. What we take to be reliable, 
rigorous techniques for theory building are
conversations built upon assumptions we have
tacitly agreed not to question. We cannot escape
responsibility for constructing and validating
our knowledge through our social practice.
Declaring a fundamental distinction between the
objective and the subjective may help us feel less
personally responsible for the grounding of our 
beliefs, but in the end it only cheats us from 
exploring other, perhaps more interesting, ways 
of understanding accounting in the social world. 


113. “But this is how it is – ” I say to myself over and
   over again. I feel as though, if only I could fix my
   gaze absolutely sharply on this fact, get it in focus, I
   must grasp the essence of the matter.
114. ... That is the kind of proposition that one
   repeats to oneself countless times. One thinks that
   one is tracing the outline of the thing’s nature over
   and over again, and one is merely tracing round the
   frame through which we look at it.
115. A picture held us captive. And we could not get
   outside it for it lay in our language and language
   seemed to repeat it to us inexorably (Wittgenstein,
   1954).


The theme of textual interpretation that is 
central to the hermeneutic turn in the social sci- 
ences is strongly evident in a recent book by 
Morgan (1986). In Images of Organization he 
explicitly adopts the metaphor of reading and
explores the images used by managers as well as 
researchers to “read the situation” in an organi- 
zation. He does not pretend to give an exhaus- 
tive inventory of such images, nor even a 
taxonomy of the kinds of images that seem to be 
used most frequently. In fact, he does not even 
bother to give close definitions of what he means
by an image or how it might be related to a frame
or a paradigm or any other organizing principle
of perception and action. Instead, he presents
eight images that he finds particularly appealing
and explores how they might be taken as
metaphors for organization. In eight chapters he
reads an organization as if it were:

(1) a machine;
(2) an organism;
(3) a brain;
(4) a culture;
(5) a political system;
(6) a psychic prison;
(7) a flux and transformation; and
(8) an instrument of domination.


Each reading portrays the different features
and qualities of organization that are brought to
light with that image, especially the way different
images shape our understanding of what the
operational problems of organizations are, what
the significant research questions are and how
we should go about addressing them. The result
is a provocative work that has much to offer
to members of the accounting research community
as they struggle with an established objectivist
tradition and entertain the possibility of
a subjectivist alternative.

In Images of Organization, Morgan has
jumped ahead of our current transitional stage of
discourse on organizations. In doing so, he has
written a book for an age with an established
sense of reading and textuality, an age that has
abandoned the objectivist’s foundational search
for an organizational essence and has also moved
beyond the relativism that seems to intimidate
the emerging subjectivist camp. In this paper, I
will give a reading of Morgan’s Images of
Organization with particular attention to drawing
out some implications for doing research on
accounting in its organizational context that also
takes a hermeneutic turn.


SOME THEORETICAL BACKGROUND: 
GADAMER & RORTY 


I say that Images of Organization is written 
for an age with an established sense of reading 
and of the interpretive nature of social science 
because it is a book that is presented without any 
explicit theoretical basis or rationale. To pro- 
vide some background on the hermeneutic turn 
and the sense of textuality that Morgan implicitly
draws upon, I will briefly review some
key points in the thought of Hans-Georg
Gadamer and Richard Rorty. Both of them are
important figures for accounting and organizational
research because they are at the forefront
of understanding why the distinction of the
objective from the subjective as a basis for
guaranteeing our knowledge of the world is a
misguided effort and what this implies for the kind
and quality of knowledge of the social world that 
is possible. Both of them thus offer valuable argu- 
ments for the kind of hermeneutic turn that 
should be adopted by research on accounting in 
its organizational context. 


Gadamer, a distinguished German philos- 
opher only relatively recently translated into En- 
glish, is perhaps the leading figure in developing 
a statement of the universal character of the
interpretive act. Most importantly, Gadamer’s
development of a “philosophical hermeneutics”
avoids an appeal to a transcendental essence for its
truth value. Rorty, a contemporary American
philosopher, has established himself as a
uniquely powerful voice for a hermeneutic turn in
the social sciences by drawing upon and extend- 
ing the work of Wittgenstein and Heidegger. He 
is in turn relating their positions to that of 
Dewey and thereby reinvigorating American 
pragmatism with a refined sense of how social 
practice shapes and limits our claims to knowl- 
edge. 

Gadamer has already been introduced to 
the information systems (Boland, 1985) and 
accounting (Lavoie, 1987) literatures. Gadamer 
(1975, 1976, 1981) has generalized the her- 
meneutic problem of textual interpretation to 
be the universal problem of achieving human 
understanding in the world. He dramatizes how 
the world that we confront every day is an alien 
one. It has already been made meaningful by 
others in ways unknown to us. The world must 
be interpreted by us if we are to engage in purpo- 
sive action in it. Our interpretation of the world 
is an historic act, grounded in our traditions. 
Gadamer calls the traditions we draw upon in in-
terpreting the world our prejudices and celeb-
rates prejudice as a positive, not a negative, ele-
ment in our ability to understand the world.


It is not so much our judgements as our prejudices that
constitute our being (Gadamer, 1976, p. 9).


Our prejudice cannot disappear, nor should 
we want it to. Our prejudice is the way we are 
open to the world and the search for truth in 
interpretation is a dialogue that opens our hori-
zon of prejudice to that of another. The kind of
understanding of the world that is possible is not
an end point that we reach when our prejudice
is stripped away, but is rather a moving dialectic
process of dialogue that always takes place anew 
at the horizon of our prejudice. 

In light of Gadamer, Morgan’s Images of 
Organization can be seen as a primer for the 
array of traditions we draw upon in reading the 
organization as an alien text. Morgan (1986) 
succeeds in opening up the horizon of each 
image, or prejudice, to the reader in a convinc-
ing, sympathetic way. Some who espouse her-
meneutics find it difficult to break the horizon of
their own traditions as Morgan appears to have
done. Lavoie (1987), for example, calls for more
hermeneutic research but seems to insist that in- 
terpretations of accounting are only valid if they 
agree with his own conviction that economic 
coordination is the essence of accounting. Mor-
gan (1986), on the other hand, engages happy
images of economic coordination as well as 
more disturbing images of psychic repression 
with equal openness. In this sense, he is a better 
example of the “fusion of horizons” and ques-
tioning of our own preconceptions called for in
Gadamer’s philosophical hermeneutics than is
Lavoie.

Gadamer’s (1975) analysis of prejudice and 
tradition is part of his larger criticism of modern, 
positive science. He accuses modern science of 
distorting our concept of language by conceiv- 
ing of the world as comprised of objects to be 
manipulated and of language as simply a tool 
positive science can use to point at, describe and 
operate on the world as it is simply given to us
(p. 253). Gadamer’s philosophical hermeneutics
emphasizes that language as symbolizing is con-
stitutive of the world and is not a mere operator
on it. 


Language is the fundamental mode of operation of our 
being-in-the-world and the all-embracing form of the 
constitution of the world (Gadamer, 1976, p. 3). 

Language is not just one of man’s possessions in the
world, but on it depends the fact that man has a world at
all (Gadamer, 1981, p. 401).


The process through which we come to an
understanding of the world is an interplay of our
tradition and the world-as-a-text, an interplay
known as the hermeneutic circle. The her-
meneutic circle is the recognition that in under-
standing a text, we depend on a theory of the
whole to understand the parts, and, at the same
time, we depend on a knowledge of the parts to
guarantee our theory of the whole. Gadamer is
careful to point out that the hermeneutic circle
is not something that disappears once a situation
is “perfectly” understood. It is not simply a
method, but is the essential underlying structure
of understanding.

The circle, then, is not formal in nature, it is neither sub-
jective nor objective, but describes understanding as the
interplay of the movement of tradition and the move-
ment of the interpreter. The anticipation of meaning that
governs our understanding of a text is not an act of sub-
jectivity, but proceeds from the communality that binds
us to the tradition ... Thus the circle of understanding is
not a ‘methodological circle,’ but describes an ontologi-
cal structural element in understanding (Gadamer, 1975,
p. 261).


Rorty (1979, 1982, 1985), in concert with
Gadamer’s philosophical hermeneutics, develops
a pragmatist critique of modern analytic philoso-
phy that contains several themes directly relev-
ant to locating Morgan’s Images of Organiza-
tion in the changing field of social science
discourse. One major theme in Rorty’s work is to
argue against the quest for a foundational epis-
temology and against the notion that philos-
ophers should be able to provide us a “perma-
nent neutral framework” for making accurate
representations of the world. In place of the
quest for a seriousness, purity and rigor that will
provide a privileged access to truth, he joins
Gadamer in suggesting that we more humbly
and appropriately try “... finding a new and
more interesting way of expressing ourselves,
and thus of coping with the world” (Rorty, 1979,
p. 359). The “linguistic turn” taken by (positive)
analytic philosophy is credited by Rorty as an 
important impetus to the interpretive, social 
constructionist themes reflected in Morgan’s 
book. It does this involuntarily by helping us to 
see “man as a self changing being, capable of re- 
making himself by remaking his speech” (Rorty, 
1985, p. 104). 


By taking seriously Kant’s notion that the Mirror of 
Nature would always be distorting, and by identifying the 
distorting element with language, analytic philosophy
helped make it possible to put aside the whole dialectic
of Subject and Object, of object and representation, and
thus all notions of “mirroring,” “distorting” and “con-
stituting” (Rorty, 1985, p. 110).


Rorty argues that we should stop searching for 
objective, ultimate foundations to our knowl- 
edge of the social world and accept that human- 
kind both makes and knows itself and its world 
through social practice. Rorty’s call that we 
come to understand our possibilities for know-
ing as a socially based conversational hermeneu-
tic asks us “to see the making of true statements
as a piece with the rest of human life, rather than
as the point at which human life encounters
demands of the ‘wholly other’” (1985, p. 110).

Taken together, Gadamer and Rorty are pre- 
conditions for Morgan’s Images of Organiza- 
tion. The movement beyond the objective—sub- 
jective dichotomy that is taken for granted by 
the book relies importantly on the positions they 
represent. Without them as background, the 
reader might continue to worry needlessly 
about which image is really the “objective” one
or, conversely, dismiss the effort as merely “sub-
jective”.


A READING OF MORGAN’S READINGS


The value of Images of Organization stems in
large part from the careful and convincing way
each image is used to make a reading of organiza-
tion. Each reading is put forward with the con-
viction of a true believer and none of the images
are treated as misguided or superficial. Each
image is presented as having both strengths and
weaknesses which Morgan sketches. Each 
metaphoric image of organization contains 
echoes of other, diversely related images be- 
cause metaphors do not stand alone but are 
interwoven in extended metaphorical com- 
plexes (Lakoff & Johnson, 1980). Morgan recog- 
nizes this quality of metaphor and explores the 
reverberating character of the multiple symbol 
systems at work in each image he explores. 

In the introduction, Morgan presents as his 
rationale for the book the belief that successful 
managers are skilled readers of organizational 
settings who act sensibly in the light of the read- 
ings they make. The book is an effort to help 
managers become more aware of the relatively 
few metaphors they use in ordering their read- 
ings and more open and flexible in making alter- 
native readings. Each metaphorical image used 
to read an organizational text will by necessity 
highlight some aspects of the situation and hide 
other aspects (Lakoff & Johnson, 1980). The 
manager as reader is thus urged to abandon the
hope for a single, all-encompassing reading and 
to accept that multiple, conflicting readings are 
the best way to keep our thinking about organi- 
zations as potent and diverse as organizations 
themselves. 

Morgan begins his series of readings with the 
familiar images of mechanism and organism. He 
then develops a series of images that should help 
to reshape our dialogue on organizations. As one 
reads the organization through his eight images, 
the sequence of images seems to follow a pattern 
of escalating reflexivity. 

At the lowest level, the initial mechanical 
image of parts working together is without a self- 
reflective capacity. Next, the organic image of a 
life force unifying the parts introduces a 
rudimentary form of self-referential organization
and development. The brain image then adds
intelligence and the possibility of a self-aware in-
dividual to this image of an isolated system. The
cultural image introduces a social reflexivity and
develops an awareness of self and other in shared
community. The political image then compli-
cates this community by depicting a strategically
self-reflective individual in conflict with equally 
reflexive and strategically intelligent beings. The 
psychic prison image raises reflexivity further by 
adding the individual’s existential struggle with 
repressed elements of the unconscious as the 
individual tries to define an autonomous self in a 
complex social context. The flux and transfor- 
mation image adds to this struggle the challenge 
of maintaining an enduring conscious identity in 
the face of continual change and ambiguity. 
Finally, the instrument of domination image 
raises the level of reflexivity to critically con- 
front one’s own false consciousness. 

Reading his set of images as a series of escalat- 
ing stages of reflexivity in social systems recalls 
Boulding’s (1956) nine levels of social systems 
(Pondy & Mitroff, 1978), as well as Bateson’s 
(1972) levels of metalanguage in communica- 
tion and Watzlawick et al.’s (1976) use of re- 
flexivity in family therapy. These familiar 
themes, which serve as an unexplored backdrop 
for the experienced reader, link Morgan’s work
to the best of recent attempts to radically chal-
lenge our thinking about organizations. The
sustained richness and organizational focus of 
Morgan’s progression of images do not just bring 
new life to an established set of ideas on intelli- 
gence, communication and self-reference; they 
are also relevant to more current interest in link-
ing the deepest micro-level of individual cogni-
tion to the broadest macro-level of organiza-
tional theory. This important emerging theme is
evident in recent work by Giddens (1979, 1984) 
and by Miller & O’Leary (1989). I will review 
each of the images explored by Morgan very 
briefly and for each of them I will develop the 
implications for understanding accounting in its 
organizational context.


Machine 


The machine image is the most familiar and
frequently invoked image for understanding
accounting in organizations. The organization as 
machine is designed for precise function and 
efficient task performance. Tasks are enumer- 
ated, isolated, simplified and assembled into sets 
of routine procedures with closely timed opera-
tions and flows. A good organization, like a good
machine, is one that runs without friction and 
consistently produces prespecified results. 
Accounting under a machine image serves the 
manager by providing facts in a reliable, dispas- 
sionate way. The organization runs according to 
standardized procedures and a calculus of 
minimum resource consumption per unit of out- 
put. Accounting systems are designed to help 
decompose tasks into micro-processes. Ac-
counting systems create transactions for each
separate activity and those transactions record
the details of task performance, including re-
source consumption and output counts. Ac-
counting transactions and reports form a hierar-
chy that parallels the hierarchy of the function-
ing parts of the organization.


Organism 

The organism is the principal rival to the 
machine as a model for understanding account- 
ing in organizations. The organization as 
organism is an open system living in an environ-
ment. It adapts to, but also evolves interactively
with, its environment. As an organism, an organi-
zation seeks to develop its inherent potentials
and to maintain a sense of organizational health.
The logic of organization thus shifts from a logic
of its internal functioning to a logic of its fitness
within an ecological niche.

Perceived in terms of an organism image, 
accounting systems should become internally 
contingent and externally sensitive. Internal re-
porting is modified as the complexity of the task
and the turbulence of the environment change.
Accounting data is changed to fit the stage of
development of individual and organization
needs. The accounting data set is expanded to in-
clude environmental scanning and to reflect a
strategic sense of the organization’s place in its
environment. The organization will succeed in 
evolving and adapting to the extent that its ac- 
counting system evolves and adapts to changing 
internal and external circumstances. 


Brain

The brain image of organization is increas-
ingly familiar to the accountant entering the
information age. The brain image focuses on the
learning and intelligent behaviour of an or-
ganism and views the information system as a
central nervous system for the organization and
as a basis for its cognition. Here Morgan plays on
the holographic quality of brains, with each part 
containing the knowledge of the whole, to 
emphasize the learning capacity of redundant, 
highly interconnected organizational systems. 
Cybernetics provides a structural principle for 
organizing feedback loops into learning systems. 
Learning proceeds through first order feedback 
loops, while learning to learn proceeds through 
second order feedback or double loop learning. 

Accounting under a brain image becomes the 
core of the organization’s nervous system. 
Organizational control is now seen as an emer- 
gent property of the monitoring systems and 
communication channels of the accounting in- 
formation system. Above we saw that the 
machine image portrays the chart of accounts, 
cost procedures and report formats as guiding 
and monitoring the efficient functioning of parts. 
The organism image, in turn, portrays them as 
serving the contingencies of adaptation and 
evolution. But the brain image breaks with the 
surface-level functional emphasis of both the 
machine and the organism. Instead, it directs our 
attention toward the network of information 
stores and flows embodied in the accounting 
systems and suggests that a quality of intelli- 
gence and a type of cognition is an emergent 
property of such a network. Accounting system 
design is then neither a question of monitoring 
compliance and efficiency nor a question of 
guiding adaption. Accounting system design is
instead a question of creating a network of inter- 
connections that is rich and complex, yet discip- 
lined enough to display intelligent behavior. By 
improving an accounting system design, we are 
improving the memory and learning capacity of 
an organization. 


Culture 

The image of organization as culture brings 
multiple intelligences and the process of creat-
ing shared meanings among them into focus.
The shared basis of the taken-for-granted and the
everyday becomes problematic. How are they
achieved? How are they sustained? These 
become the major questions for understanding 
organizations. Viewing organization as culture 
highlights the myths, rituals and story-telling
that other images ignore. Organizations don’t 
just function, adapt or learn; they act in those 
ways with respect to things that have meaning 
and things have meaning because they have 
been made meaningful. When organizations are 
viewed as culture, it is a question of how things 
are made meaningful that is the key to under- 
standing them. 

Seen in such a manner, accounting becomes a 
central element in the shaping of organizational 
reality. Accounting is a principal and ubiquitous 
ceremonial function in organizations. From the 
highest level strategy and budget review process 
to the lowest level transaction approval and 
countersigning process, the accounting system 
is at work celebrating economic rationality, con- 
firming privileges of rank, reflecting structures 
of authority and embodying our dreams of effi- 
ciency and purposeful coherence. The account- 
ing system and its cycle of events also provides a 
structural space for organizational sense-mak- 
ing. It makes available a common legitimized 
vocabulary of economic calculations, it sequ- 
ences moments of management planning and re- 
view and it provides a common version of or- 
ganizational history. All of these features are im- 
portant in making managerial dialogue in the 
form of rational decision-making possible. An ac- 
counting system under the organization as cul- 
ture image becomes a sense-making space 
within which organization members identify 
and talk about significant events and themes. Ac- 
counting systems help in this way to structure 
and enable a community search for meaning. 


Political system 

The image of an organization as a political sys- 
tem is a more recently emerging view for the 
accountant. It focuses on the modes of gover- 
nance employed in organizations and it reveals 
the multiple interests that motivate individuals 
and groups, the conflict inherent in organization 
life and the dynamics of power underlying or-
ganizational action.


Accounting under a political system image is
seen as a reflection of the distribution of organi-
zational power, masking the struggle for re-
source control under its display of economic ra-
tionality. The accounting system hides gender,
racial and other forms of discrimination behind
the objective “facts” of its official categories. But
especially, accounting imposes a unitary view on
a strongly pluralist process. Accounting systems
allow only one of the many competing versions
of an organization’s economic reality to be legit-
imized. The accounting system can thus silence
economic representations from those political 
positions outside the main power structure. 


Psychic prison 

The image of an organization as a psychic 
prison introduces a number of new ways of 
understanding organizations to the accounting 
community. The psychic prison image focuses 
on the ways that organization processes are 
manifestations of the human unconscious, espe- 
cially our deep seated struggles to establish and 
justify a concept of self. Drawing on the Freudian 
tradition, Morgan makes a striking analysis of 
Frederick Taylor as an anal compulsive whose 
life-long obsession with precise regularity, neat- 
ness and order shaped modern organizations 
through the form of scientific management. Mor- 
gan draws on Foucault (1979) to argue that or- 
ganization and sexuality are intimately linked in 
the process of control. Organizational control in 
the monastery, the factory and the school are 
fundamentally based on the regulation and con- 
trol of the body. 

Accounting under the image of organization 
as psychic prison takes on powerful new mean- 
ings. Accounting manifests repressed sexuality 
through its surface level stereotype of anal com- 
pulsive features, such as the constant search for 
balance, the precisely ordered columns, the 
strict attention to time-deadlines, the search for
and elimination of deviance, etc. But more
importantly, accounting reflects our individual
struggle with establishing and maintaining self.
Accounting in this sense is part of a compulsion,
thoroughly internalized and operating through
the subconscious, in which we try to produce
ourselves and to allow ourselves to be produced
as useful bodies and lives.


My personal preference is to read accounting 
through the psychic prison image as a confronta- 
tion with our own finitude and inevitable death. 
We each encounter an existential sense of dread 
from contrasting the immortal quality we ex- 
perience in our thoughts, our minds and our 
sense of conscious self with the finite, temporal 
experience of our body and the realization of 
our unavoidable death. Accounting as a human 
practice is a way of writing our lives in transac- 
tions and records that promise to live forever 
and, in that way, overcome our finitude. The 
texts we create of our lives through our account- 
ings can be everlasting. They can live indefi- 
nitely, attesting to the importance and signifi- 
cance of our fleeting moments of action. Our 
lives thus become immortal through the self-
documenting texts of our accounting systems.


Flux and transformation 

The image of organization as flux and transfor- 
mation opens another new dialogue for accoun- 
tants on the ways organizational systems are 
continuously self-organizing and self-reproduc- 
ing. Building on the notion of a complex 
cybernetic system, organizations are portrayed 
as displaying a circular network of mutually 
causal relations with both deviation amplifying 
and deviation reducing feedback loops. Here, 
Morgan provides an excellent discussion of au- 
topoiesis or the self-producing logic of a closed 
set of complex relations. The image of flux and 
transformation is then further developed to 
explore the dialectical character of social or- 
ganizations and the essential contradictions in- 
herent in them. Seen as a dialectic, organiza- 
tional events are the visible manifestations pro- 
duced by a deep structure of multi-leveled con- 
tradictions. Nested sets of primary and secon- 
dary oppositions (labor vs capital, male vs 
female, culture vs nature, self-interest vs social 
needs, etc.) continuously alternate between 
foreground and background, each creating the 
conditions for, and giving rise to, each other, as in
the unfolding of an implicate order. 

The image of organization as flux and transfor- 
mation highlights the accounting system as a 
practice that enables us to freeze the flux, as it
were, and to avoid confronting organizational
contradictions. Accounting thus becomes not so 
much a vehicle to reflect our subconscious as a 
vehicle for avoiding coming face to face with the 
deeper, generative processes of organization. In 
place of a continuously unfolding dialectic of 
oppositions and contradictions, the accounting 
system affirms a fiction of organization as cohe- 
rent entity, corporation as strong, stable ego. 
The accounting system serves to delimit the 
organization from the non-organization, to 
separate subunits from one another and, in gen-
eral, to establish boundaries that give the appear-
ance of an integral, single unit with functional
cohesion. The constant turmoil below the sur-
face is masked by the counter-image of a consis-
tent organizational identity for which the ac-
counting is being made.


Domination 

The image of organization as domination is the 
last of Morgan’s readings and explores how the 
exploitation of individuals and the achievement 
of organizations are inextricably linked. An 
organization’s authority structure is a mode of 
legitimized domination. Workers, the environ- 
ment, national political systems and underde- 
veloped countries are all objects of exploitation 
that enable the economic achievement of or- 
ganizations. Organizations impose stress on 
workers and environment alike as pollution, oc- 
cupational disease and the deskilling of labor be- 
come accepted as inherent by-products of 
organizational accomplishment. 

Accounting under the image of domination is 
portrayed as an active element in partitioning 
work processes and deskilling the individual, in 
failing to capture and report externalities and in 
playing transfer-pricing games to exploit under- 
developed nations. With the domination image, 
as with the psychic prison and the flux and trans- 
formation images, the readings of accounting are 
not pleasant ones. In contrast to the self-con-
gratulatory prose that marks most professional
and academic portraits of accounting, these last
three images expose its participation in a sys-
tematic avoidance of representing the uglier
face of organization. Seen through the domina-
tion image, accounting reveals its preference for
reinforcing the happier, light-hearted pictures of
organization and suppressing the violent and the
oppressive, but no less real, experience of living
in them. Under an image of domination, account- 
ing is portrayed as the written record of the false 
consciousness induced by participation in mod- 
ern organizations. 


IMPLICATIONS FOR ACCOUNTING RESEARCH 


Morgan (1986) explores his images of organi- 
zation in a way that leaves each intact, important 
and believable. He carefully avoids denigrating 
or idolizing any of them, for he recognizes that 
each image is ideologically informed and none 
alone is adequate for representing organizations. 
Organizations are all these things and more 
simultaneously. Morgan’s message is to break 
from our slumber of absolutist, singular theories 
of organization and to become skilled at more 
subtle reading than any “school” or “contin- 
gency” model currently provides us. In the final 
analysis, reading an organization is not a passive 
observation of it, but an active construction or 
enactment of it. This is the kind of reading as 
praxis that Morgan hopes to inform in both man- 
agers and researchers. His central message is that 
both groups should take their reading and writ- 
ing of organizations and their accountings more 
seriously. The implications for accounting 
research that I will draw from this text all center 
on metaphor and the making of meanings: mean- 
ings made by organizational actors who engage 
accountings and meanings made by researchers 
as they study accounting in its organizational 
context.

Images of Organization dramatizes the im- 
portance and power of metaphor in shaping our 
understanding of the interrelation of accounting 
and organizations. An extended exploration of 
metaphorical complexes such as the book pro-
vides helps sensitize the reader to the ubiquitous
nature of metaphorically structured dialogue as 
it is employed by the actors in organizations and 
by those who research them. As one reads the 
eight images in succession, one begins to sense 
that Morgan slowly loses the fineness of explica- 
tion that marks the early chapters on machine 
and organism as he moves along through the 
later images. But it may instead be that the
reader becomes better able to see metaphorical 
references and better able to follow metaphori- 
cal implications as he or she goes through the 
chapters. As one becomes a more experienced 
reader of the metaphors that structure our ex- 
perience, one can better sense the unexplored 
possibilities in Morgan’s later metaphors, which 
are less well entrenched in our vocabulary. 

Metaphors, following Lakoff & Johnson
(1980), highlight some aspects of a situation 
while hiding others. They operate by revealing 
the similarity of relationships across different 
domains of experience and, in so doing, they 
provide us with conceptual structures. Since any 
one metaphor is always partial, these metaphori- 
cally structured concepts are multiple, diverse 
and potentially conflicting. Metaphors are not 
just a colorful way of expressing ourselves, but 
underlie our everyday cognitive structures. We 
think and act every day on the basis of metaphor- 
ically structured concepts and, in this sense, we 
cannot not use metaphor. 

One research issue for accounting scholars is
to continue the program that Morgan has begun.
The images he introduces need further develop-
ment of their potential readings, field work on
their actual use and impact, and analysis of their 
assemblage as metaphorical complexes. When 
we look at accounting research in this light, we 
see a number of dominant metaphors that are 
not addressed in Images of Organization or in 
his subsequent work (Morgan, 1988). The 
metaphors of organization-as-transaction and or- 
ganization-as-contract (or nexus of contracts) 
and the metaphor of information as commodity 
that so thoroughly predominate current at- 
tempts to transform organization theory and ac- 
counting into economics are conspicuously mis- 
sing. 


If Morgan were to succeed in debunking the 
imperialist, monolithic strains of organizational 
analysis that stand most strongly against his call 
for a pluralist, interactionist approach, one 
would think that these three metaphors would 
be central targets. Because he avoids these 
metaphors, the efforts to recast organization 
theory as transaction costs, principal agent rela- 
tions and information economics go untouched. 
But they clearly need to be deconstructed 
(Boland, 1986; Neimark & Tinker, 1987; Ar- 
rington & Francis, 1989). They are metaphors 
parading as rigorous analytic models that will 
too easily escape analysis as examples of lan- 
guage practice and interpretive reading until 
such metaphorical unpacking is accomplished. 

If research on the structuring of our own dis- 
course is to be serious and productive, it must 
unpack not only the metaphorical structures of 
the dominant, established theories, but those of 
their emerging challengers as well. All theories 
we draw upon, friend and foe alike, should be 
recognized as problematic and temporary struc-
tures and subject to systematic unpacking and
critical reflection.

A second research implication for understand-
ing accounting in organizations is the general
theme of reading and textuality that Images of
Organization exemplifies. Our research can 
profitably become more orientated toward the 
making and the using of accountings as a text. 
Our research might then be more open to 
Rorty’s (1979) message that we stop our quest 
for universal truth and be content to engage in 
good, interesting conversations. He implies that 
we should give up searching for the ultimate 
foundations of laws that operate as people make 
and use accountings and instead accept the uses 
of accounting and the doing of our research as 
practices for which the accepted topics, 
methods and standards of proof are socially con- 
structed. A hermeneutic reading of accounting 
as text is the most hopeful way to approach an 
organizational understanding of accounting as a 
human practice. 

A third area of research suggested by Morgan’s 
work is to better understand the process of mak- 
ing and changing meanings, of structuring and 
restructuring problems. Understanding the processes
through which participants frame and
reframe a situation is an important part of under- 
standing the making of meanings. How are 
images involved in the maintenance of meanings 
over time? How are situations invested with new 
and different images and meanings? An appreciation
for the metaphors used by actors in the
reading and re-reading of their experience 
would bring a new richness to our fieldwork. 
How do the multiple, contradictory images at 
work in an organization clash, mesh or reinforce 
each other as people give meanings to account- 
ing in the situated practice of organizational life? 


Morgan’s conclusion is that we should learn to
“imaginize” a different, more open future for
accounting and organizations by putting our
familiar images aside and exploring new ones.
But this conclusion is not convincing as it stands. 
It fails to appreciate the strength of tradition be- 
hind our prejudice (in Gadamer’s sense) for the 
images that we habitually use. The images
through which an individual is open to reading
an organization and the horizons of his or her interpretive
frames are deeply rooted in a particular
familial, cultural and economic history. Such
traditions cannot be switched as facilely as Morgan
suggests. Morgan should more clearly confront
the strength and resilience of the traditions
behind the various images he explores. The resilience
of images makes research on how they
are maintained and changed all the more important.

The resilience of our images as a prejudice
suggests a final research area, that of an action
research program in which metaphors are used
as organizational interventions. Just as Morgan’s
images can be helpful for stimulating a more
critical awareness of the metaphors implicit in 
the practices and theories of our research, they 
can also be helpful for stimulating and guiding 
self-reflection on the implicit structures that are 
employed by organizational actors who make 
and use accountings. This kind of action re- 
search would enable us to make interpretive 
readings as organizational members themselves 
explored the metaphorical structuring of their 
own conceptual systems and, most importantly, 
as they confronted and attempted to engage new 
metaphors in a process of organizational learn- 
ing. 

Images of Organization should add to the 
quality and interest of our conversation on the 
interrelation of accounting and organizations. 
Although the theoretical perspectives of the hermeneutic
approach that it reflects must be
sought elsewhere, the book nonetheless pro- 
vides us with a rich source of challenging 
images. Perhaps most important is the way the 
book is written as if it were for an age with an 
established sense of reading and textual analysis: 
an age that takes it for granted that accounting in 
organizations is as much or more a question of 
literary criticism and hermeneutic analysis than 
it is a question of economic imperatives. There is 
something to be said for this presumption of an 
era that is not yet our own. Here is hoping that 
others also adopt this presumption and get on 
with the business of making better interpretive 
conversations in our research. This will happen 
as we do more such studies and begin talking 
critically about them. 


BIBLIOGRAPHY 


Ansari, S. & Euske, J. K., Rational, Rationalizing and Reifying Uses of Accounting Data in Organizations, 
Accounting, Organizations and Society (1987) pp. 549-570.

Arrington, E. & Francis, J., Letting the Chat Out of the Bag: Deconstruction, Privilege and Accounting 
Research, Accounting, Organizations and Society (1989) pp. 1-28.

Bateson, G., Steps to an Ecology of Mind (New York: Ballantine Books, 1972). 

Bernstein, R. J., The Restructuring of Social and Political Theory (New York: Harcourt Brace Jovanovich,
1976).


Bernstein, R. J., Beyond Objectivism and Relativism: Science, Hermeneutics and Praxis (Philadelphia:
University of Pennsylvania Press, 1983).


Boland, R., Control, Causality and Information Requirements, Accounting, Organizations and Society
(1979) pp. 259-272.

Boland, R., Phenomenology: A Preferred Approach to Research on Information Systems, in Mumford, E.,
Hirschheim, R., Fitzgerald, G. and Wood-Harper, A. T. (Eds) Research Methods in Information Systems,
pp. 193-201 (Amsterdam: North Holland, 1985).

Boland, R., Fantasies of Information, in Neimark, M., Merino, B. and Tinker, T. (eds) Advances in Public
Interest Accounting, pp. 49-65 (Greenwich: JAI Press, 1986).

Boland, R. & Pondy, L. R., Accounting in Organizations: A Union of Rational and Natural Perspectives,
Accounting, Organizations and Society (1983) pp. 223-234. 

Boland, R. & Pondy, L. R., The Micro Dynamics of a Budget-cutting Process: Modes, Models and Structure,
Accounting, Organizations and Society (1986) pp. 403-422.

Boulding, K., The Image (Ann Arbor: University of Michigan Press, 1956). 

Burrell, G. & Morgan, G., Sociological Paradigms and Organizational Analysis (London: Heinemann, 
1979). 

Chua, Wai Fong, Radical Developments in Accounting Thought, The Accounting Review (1986) pp. 601-632.

Colville, I., Reconstructing “Behavioral Accounting”, Accounting, Organizations and Society (1981) pp.
119-132. 

Cooper, D., Tidiness, Muddle and Things: Commonalities and Divergencies in Two Approaches to 
Management Accounting Research, Accounting, Organizations and Society (1983) pp. 269-286.

Covaleski, M. A. & Dirsmith, M. W., The Budgetary Process of Power and Politics, Accounting,
Organizations and Society (1986) pp. 193-214.

Covaleski, M. A. & Dirsmith, M. W., The Use of Budgetary Symbols in the Political Arena: An Historically 
Informed Field Study, Accounting, Organizations and Society (1988) pp. 1-24.

Foucault, M., Discipline and Punish (New York: Vintage, 1979). 

Gadamer, H.-G., Truth and Method (New York: The Seabury Press, 1975). 

Gadamer, H.-G., Philosophical Hermeneutics (Berkeley: University of California Press, 1976).

Gadamer, H.-G., Reason in the Age of Science (Cambridge: MIT Press, 1981). 

Geertz, C., Deep Play: Notes on a Balinese Cockfight, Daedalus (1972) pp. 1-37.

Giddens, A., Central Problems in Social Theory (London: MacMillan, 1979). 

Giddens, A., The Constitution of Society (Berkeley: University of California Press, 1984). 

Hopper, T. & Powell, A., Making Sense of Behavioral Research into Management Accounting, Journal of 
Management Studies (1985) pp. 429-465.

Hopper, T., Storey, J. & Willmott, H., Accounting for Accounting: Towards the Development of a Dialectical
View, Accounting, Organizations and Society (1987) pp. 437-456. 

Hopwood, A., The Archeology of Accounting Systems, Accounting, Organizations and Society (1987) pp. 
207-234. 

Lakoff, G. & Johnson, M., Metaphors We Live By (Chicago: University of Chicago Press, 1980).

Lavoie, D., The Accounting of Interpretation and the Interpretation of Accounts: The Communicative 
Function of the Language of Business, Accounting, Organizations and Society (1987) pp. 579-604.

Miller, P. & O'Leary, T., Hierarchies and American Ideals, Academy of Management Review (1989) 
forthcoming. 

Morgan, G., Images of Organization (Beverly Hills: Sage, 1986).

Morgan, G., Accounting as Reality Construction: Towards a New Epistemology for Accounting Practice, 
Accounting, Organizations and Society (1988) pp. 477-485. 

Nahapiet, J., The Rhetoric and Reality of an Accounting Change: A Study of Resource Allocation,
Accounting, Organizations and Society (1988) pp. 333-358.

Neimark, M. & Tinker, T., Identity and Non-identity Thinking: A Dialectical Critique of the Transaction Cost 
Theory of the Modern Corporation, Journal of Management (1987) pp. 661-673.

Pondy, L. & Mitroff, I., Beyond Open Systems Models of Organization, in Staw, B. M. and Cummings, L. L. 
(eds) Research in Organizational Behavior (Greenwich: JAI Press, 1978).

Rorty, R., Philosophy and the Mirror of Nature (Princeton: Princeton University Press, 1979).

Rorty, R., Consequences of Pragmatism (Minneapolis: University of Minnesota Press, 1982).

Rorty, R., Epistemological Behaviorism and the Detranscendentalization of Analytic Philosophy, in
Hollinger, R. (ed.) Hermeneutics and Praxis pp. 89-121 (Notre Dame: University of Notre Dame Press,
1985).

Taylor, C., Interpretation and the Sciences of Man, Review of Metaphysics (1971) pp. 3-51. 




Tomkins, C. & Groves, R., The Everyday Accountant and Researching his Reality, Accounting, Organizations
and Society (1983) pp. 361-377.
Watzlawick, P., Beavin, J. & Jackson, D., Pragmatics of Human Communication (New York: W.W. Norton, 
1976). 
Wittgenstein, L. (Trans. by Anscombe, G.E.M.), Philosophical Investigations (Oxford: Basil Blackwell, 
1974 [1954]).


Accounting, 
Organizations 
and Society 


EDITOR-IN-CHIEF: ANTHONY G. HOPWOOD 


VOLUME 14 1989 


Pergamon Press - Oxford - New York - Beijing - Frankfurt
São Paulo - Sydney - Tokyo - Toronto


Accounting, Organizations and Society


EDITOR-IN-CHIEF


Anthony G. Hopwood: London School of Economics and Political Science, Houghton Street,
London WC2A 2AE





ASSOCIATE EDITORS


Jacob G. Birnberg
Graduate School of Business
University of Pittsburgh,
Pittsburgh

Peter Miller
Department of Accounting
and Finance
London School of Economics







Publishing, Subscription and Advertising Offices: Pergamon Press plc, Headington Hill Hall, Oxford OX3 0BW
(Oxford 64881; Telex 83177).
Annual Subscription Rates 1990 (including postage and insurance) 


Annual institutional subscription rate (1990) DM 770.00. 2 year institutional rate (1990/91) DM 1463.00. Personal sub- 
scription for those whose library subscribes at the regular rate (1990) DM 240.00. All subscription enquiries should be 
addressed to: The Subscription Fulfilment Manager, Pergamon Press plc, Headington Hill Hall, Oxford OX3 0BW, U.K.
Prices are subject to change without notice. 

Back Issues: Back issues of all previously published volumes, in both hard copy and on microform, are available direct from 
Pergamon Press offices. 


Published 6 times per annum 

Copyright © 1989 Pergamon Press plc


VOLUME CONTENTS


Number 1/2

C. E. ARRINGTON and J. R. FRANCIS  1  Letting the chat out of the bag: deconstruction, privilege and accounting research
B. CZARNIAWSKA-JOERGES and B. JACOBSSON  29  Budget in a cold climate
D. V. DEJONG, R. FORSYTHE, JAE-OH KIM and W. C. UECKER  41  A laboratory investigation of alternative transfer pricing mechanisms
T. SELLING and J. SHANK  65  Linear versus process tracing approaches to judgment modelling: a new perspective on cue importance

Research on Audit Judgment

T. J. MOCK  81  Introduction
P. E. JOHNSON, K. JAMAL and R. G. BERRYMAN  83  Audit judgment research
L. R. BEACH and J. R. FREDERICKSON  101  Image theory: an alternative description of audit decisions
J. BEDARD  113  Expertise in auditing: myth or reality?
G. F. KLERSEY and T. J. MOCK  133  Verbal protocol research in auditing
K. V. PINCUS  153  The efficacy of a red flags questionnaire for assessing the possibility of fraud
J. SHANTEAU  165  Cognitive heuristics and biases in behavioral auditing: review, comments and observations
W. S. WALLER and W. L. FELIX, JR  179  Auditors’ causal judgments: effects of forward vs backward inference on information processing


Number 3

201  Editorial Announcement
P. D. BOUGEN  203  The emergence, roles and consequences of an accounting-industrial relations interaction
P. DANOS, D. L. HOLT and E. A. IMHOFF, JR  235  The use of accounting information in bank lending decisions
L. A. GORDON  247  Benefit-cost analysis and resource allocation decisions
A. HARRELL, M. TAYLOR and E. CHEWNING  259  An examination of management’s ability to bias the professional objectivity of internal auditors
T. PINCH, M. MULKAY and M. ASHMORE  271  Clinical budgeting: experimentation in the social sciences: a drama in five acts


Number 4

BEN-HSIEN BAO and DA-HSIEN BAO  303  LIFO adoption: a technology diffusion analysis
A. S. DUNK  321  Budget emphasis, budgetary participation and managerial performance: a note
O. A. IMOISILI  325  The role of budget data in the evaluation of managerial performance
K. KEASEY and R. WATSON  337  Consensus and accuracy in accounting studies of decision-making: a note on a new measure of consensus
L. MIA  347  The impact of participation in budgeting and job difficulty on managerial performance and work motivation: a research note
J. M. PETERS, B. L. LEWIS and V. DHAR  359  Assessing inherent risk during audit planning: the development of a knowledge based model


Number 5/6

P. F. LUCKETT and M. K. HIRST  379  The impact of feedback on inter-rater agreement and self insight in performance evaluation decisions
A. M. PRESTON  389  The Taxman cometh: some observations on the interrelationship between accounting and Inland Revenue practice
A. J. RICHARDSON  415  Corporatism and intraprofessional hegemony: a study of regulation and internal social order
M. WALKER  433  Agency theory: a falsificationist perspective
P. F. WILLIAMS  455  The logic of positive accounting research

Studies of the Cognitive Aspects of Accounting

J. L. BUTT and T. L. CAMPBELL  471  The effects of information order and hypothesis-testing strategies on auditors’ judgments
F. CHOO  481  Cognitive scripts in auditing and accounting behavior
L. R. DAVIS  495  Report format and the decision maker’s task: an experimental investigation
G. DESANCTIS and S. L. JARVENPAA  509  Graphical presentation of accounting data for financial forecasting: an experimental investigation
J. M. HASSELL and C. E. ARRINGTON  527  A comparative analysis of the construct validity of coefficients in paramorphic models of accounting judgments: a replication and extension
S. E. KAPLAN and P. M. J. RECKERS  539  An examination of information search during initial audit planning
S. E. C. PURVIS  551  The effect of audit documentation format on data collection
K. T. TROTMAN and J. SNG  565  The effect of hypothesis framing, prior expectations and cue diagnosticity on auditors’ information choice
B. WONG-ON-WING, J. H. RENEAU and S. G. WEST  577  Auditors’ perception of management: determinants and consequences

Biblioscene

R. J. BOLAND, Jr  591  Beyond the objectivist and the subjectivist: learning to read accounting as text


I  Volume Contents and Author Index for Volume 14


AUTHOR INDEX


Arrington, C. E. 1, 527
Ashmore, M. 271
Beach, L. R. 101
Bedard, J. 113
Ben-Hsien Bao 303
Berryman, R. G. 83
Boland, R. J. Jr 591
Bougen, P. D. 203
Butt, J. L. 471
Campbell, T. L. 471
Chewning, E. 259
Choo, F. 481
Czarniawska-Joerges, B. 29
Da-Hsien Bao 303
Danos, P. 235
Davis, L. R. 495
DeSanctis, G. 509
DeJong, D. V. 41
Dhar, V. 359
Dunk, A. S. 321
Felix, W. L. Jr 179
Forsythe, R. 41
Francis, J. R. 1
Frederickson, J. R. 101
Gordon, L. A. 247
Harrell, A. 259
Hassell, J. M. 527
Hirst, M. K. 379
Holt, D. L. 235
Imhoff, E. A. Jr 235
Imoisili, O. A. 325
Jacobsson, B. 29
Jae-Oh Kim 41
Jamal, K. 83
Jarvenpaa, S. L. 509
Johnson, P. E. 83
Kaplan, S. E. 539
Keasey, K. 337
Klersey, G. F. 133
Lewis, B. L. 359
Luckett, P. F. 379
Mia, L. 347
Mock, T. J. 81, 133
Mulkay, M. 271
Peters, J. M. 359
Pinch, T. 271
Pincus, K. V. 153
Preston, A. M. 389
Purvis, S. E. C. 551
Reckers, P. M. J. 539
Reneau, J. H. 577
Richardson, A. J. 415
Selling, T. 65
Shank, J. 65
Shanteau, J. 165
Sng, J. 565
Taylor, M. 259
Trotman, K. T. 565
Uecker, W. C. 41
Walker, M. 433
Waller, W. S. 179
Watson, R. 337
West, S. G. 577
Williams, P. F. 455
Wong-On-Wing, B. 577


